std: use `sync::Mutex` for internal statics
Since `sync::Mutex` is now `const`-constructible, it can be used for internal statics, removing the need for `sys_common::StaticMutex`. This adds some extra allocations on platforms which need to box their mutexes (currently SGX and some UNIX), but these will become unnecessary with the lock improvements tracked in #93740.
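For illustration (the static and its contents are made up, not the actual std internals), this is the pattern a `const`-constructible `Mutex` enables:
```rust
use std::sync::Mutex;

// A mutex-protected static with no lazy initialization and no
// `sys_common::StaticMutex`: `Mutex::new` and `Vec::new` are both `const`.
static REGISTERED: Mutex<Vec<String>> = Mutex::new(Vec::new());

fn register(name: &str) {
    REGISTERED.lock().unwrap().push(name.to_string());
}
```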
I changed the program argument implementation on Hermit; it does not need `Mutex` but can use atomics like some UNIX systems (ping `@mkroening` `@stlankes`).
Use semaphores for thread parking on Apple platforms
Currently we use a mutex-condvar pair for thread parking on Apple systems. Unfortunately, `pthread_cond_timedwait` uses the real-time clock for measuring time, which causes problems when the system time changes. The parking implementation in this PR uses a semaphore instead, which measures monotonic time by default, avoiding these issues. As a further benefit, this has the potential to improve performance a bit, since `unpark` does not need to wait for a lock to be released.
Since the Mach semaphores are poorly documented (I could not find availability or stability guarantees for instance), this uses a [dispatch semaphore](https://developer.apple.com/documentation/dispatch/dispatch_semaphore?language=objc) instead. While it adds a layer of indirection (it uses Mach semaphores internally), the overhead is probably negligible.
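For reference, a rough sketch of parking on a dispatch semaphore (not the actual std code; the extern declarations mirror the C API from `<dispatch/dispatch.h>`, and the real implementation additionally collapses multiple `unpark`s into a single token, which is omitted here):
```rust
use std::ffi::c_void;

const DISPATCH_TIME_FOREVER: u64 = !0;

extern "C" {
    fn dispatch_semaphore_create(value: isize) -> *mut c_void;
    fn dispatch_semaphore_wait(sema: *mut c_void, timeout: u64) -> isize;
    fn dispatch_semaphore_signal(sema: *mut c_void) -> isize;
    fn dispatch_release(object: *mut c_void);
}

struct Parker(*mut c_void);

unsafe impl Send for Parker {}
unsafe impl Sync for Parker {}

impl Parker {
    fn new() -> Parker {
        // An initial value of 0 means `wait` blocks until someone `signal`s.
        Parker(unsafe { dispatch_semaphore_create(0) })
    }

    fn park(&self) {
        // Timeouts on dispatch semaphores are measured against the monotonic
        // clock, so wall-clock changes do not affect parking.
        unsafe { dispatch_semaphore_wait(self.0, DISPATCH_TIME_FOREVER) };
    }

    fn unpark(&self) {
        unsafe { dispatch_semaphore_signal(self.0) };
    }
}

impl Drop for Parker {
    fn drop(&mut self) {
        unsafe { dispatch_release(self.0) };
    }
}
```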
Tested on macOS 12.5.
r? `@thomcc`
sync thread_local key conditions exactly with what the macro uses
This makes the `cfg` in `mod.rs` syntactically the same as those in `local.rs`.
I don't think this should actually change anything, but it seems better to be consistent.
I looked into this due to https://github.com/rust-lang/rust/issues/102549, but this PR would make it *less* likely that `__OsLocalKeyInner` is going to get provided, so this cannot help with that issue.
r? `@thomcc`
scoped threads: pass closure through MaybeUninit to avoid invalid dangling references
The `main` function defined here looks roughly like this, if it were written as a more explicit stand-alone function:
```rust
// Not showing all the `'lifetime` tracking, the point is that
// this closure might live shorter than `thread`.
fn thread(control: ..., closure: impl FnOnce() + 'lifetime) {
closure();
control.signal_done();
// A lot of time can pass here.
}
```
Note that `thread` continues to run even after `signal_done`! Now consider what happens if the `closure` captures a reference of lifetime `'lifetime`:
- The type of `closure` is a struct (the implicit unnameable closure type) with a `&'lifetime mut T` field. References passed to a function are marked with `dereferenceable`, which is LLVM speak for *this reference will remain live for the entire duration of this function*.
- The closure runs, `signal_done` runs. Then -- potentially -- this thread gets scheduled away and the main thread runs, seeing the signal and returning to the user. Now `'lifetime` ends and the memory the reference points to might be deallocated.
- Now we have UB! The reference that was passed to `thread` with the promise of remaining live for the entire duration of the function actually got deallocated while the function is still running. Oops.
Long-term I think we should be able to use `ManuallyDrop` to fix this without `unsafe`, or maybe a new `MaybeDangling` type. I am working on an RFC for that. But in the mean time it'd be nice to fix this so that Miri with `-Zmiri-retag-fields` (which is needed for "full enforcement" of all the LLVM flags we generate) stops erroring on scoped threads.
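Roughly, the workaround smuggles the closure through `MaybeUninit` so that no such promise is made for the whole duration of `thread` (a simplified sketch, not the actual std internals; `Control` is a hypothetical stand-in):
```rust
use std::mem::MaybeUninit;

struct Control;
impl Control {
    fn signal_done(&self) {}
}

fn thread<F: FnOnce()>(control: Control, closure: MaybeUninit<F>) {
    // SAFETY: the spawning side passed an initialized closure exactly once.
    let closure = unsafe { closure.assume_init() };
    // The closure, and with it any captured references, is consumed and
    // dropped here, before `signal_done` runs.
    closure();
    control.signal_done();
    // A lot of time can pass here, but nothing claims the captures are still live.
}

fn main() {
    let x = String::from("borrowed");
    thread(Control, MaybeUninit::new(|| println!("{x}")));
}
```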
Fixes https://github.com/rust-lang/rust/issues/101983
r? `@m-ou-se`
Update docs so that deprecated method points to relevant method
The docs for the deprecated `park_timeout_ms` method suggest that the user use the `park_timeout` method instead (at https://doc.rust-lang.org/std/thread/index.html).
This makes a similar change so that the docs for the deprecated `sleep_ms` method suggest that the user use the `sleep` method instead.
Add cgroupv1 support to available_parallelism
Fixes #97549
My dev machine uses cgroup v2 so I was only able to test that code path. So the v1 code path is written only based on documentation. I could use some help testing that it works on a machine with cgroups v1:
```
$ x.py build --stage 1
# quota.rs
fn main() {
    println!("{:?}", std::thread::available_parallelism());
}
# assuming stage1 is linked in rustup
$ rustc +stage1 quota.rs
# spawn a new cgroup scope for the current user
$ sudo systemd-run -p CPUQuota="300%" --uid=$(id -u) -tdS
# should print Ok(3)
$ ./quota
```
If it doesn't work as expected, an strace, the contents of `/proc/self/cgroup`, and the structure of `/sys/fs/cgroup` would help.
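For anyone testing: the v1 path boils down to roughly this calculation (a minimal sketch assuming the cpu controller is mounted directly at `/sys/fs/cgroup/cpu`; the real code resolves the mount point and the process's own cgroup first):
```rust
use std::fs;

fn cgroup_v1_quota() -> Option<usize> {
    let quota: i64 = fs::read_to_string("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
        .ok()?
        .trim()
        .parse()
        .ok()?;
    let period: i64 = fs::read_to_string("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
        .ok()?
        .trim()
        .parse()
        .ok()?;
    if quota <= 0 || period <= 0 {
        // A quota of -1 means "no limit"; fall back to the CPU count.
        return None;
    }
    // E.g. CPUQuota=300% gives quota = 300_000 and period = 100_000, i.e. 3.
    Some((quota / period).max(1) as usize)
}

fn main() {
    println!("{:?}", cgroup_v1_quota());
}
```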
Remove `#[rustc_deprecated]`
This removes `#[rustc_deprecated]` and introduces diagnostics to help users to the right direction (that being `#[deprecated]`). All uses of `#[rustc_deprecated]` have been converted. CI is expected to fail initially; this requires #95958, which includes converting `stdarch`.
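For illustration, a conversion looks roughly like this (the item and version numbers here are made up):
```rust
// Previously, inside the standard library:
//
//     #[rustc_deprecated(since = "1.61.0", reason = "use `new_function` instead")]
//
// now becomes the ordinary attribute, with `note` replacing `reason`:
#[deprecated(since = "1.61.0", note = "use `new_function` instead")]
pub fn old_function() {}

pub fn new_function() {}
```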
I plan on following up in a short while (maybe a bootstrap cycle?) removing the diagnostics, as they're only intended to be short-term.
Mutex and Condvar are being replaced by more efficient implementations, which need thread parking themselves (see #93740). Therefore, use the pthread synchronization primitives directly. Also, avoid allocating, because the Parker struct is placed in an Arc anyway.
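For reference, a rough sketch of the idea using the `libc` crate (not the actual std code; the real Parker also has to guarantee it is never moved after first use, since statically initialized pthread primitives must stay in place):
```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

// The pthread mutex and condvar live inline in the Parker, so putting the
// Parker itself in an Arc needs no further allocation.
struct Parker {
    lock: UnsafeCell<libc::pthread_mutex_t>,
    cvar: UnsafeCell<libc::pthread_cond_t>,
    notified: AtomicBool,
}

unsafe impl Send for Parker {}
unsafe impl Sync for Parker {}

impl Parker {
    fn new() -> Parker {
        Parker {
            lock: UnsafeCell::new(libc::PTHREAD_MUTEX_INITIALIZER),
            cvar: UnsafeCell::new(libc::PTHREAD_COND_INITIALIZER),
            notified: AtomicBool::new(false),
        }
    }

    fn park(&self) {
        unsafe {
            libc::pthread_mutex_lock(self.lock.get());
            // Loop to tolerate spurious wakeups; consume the token on exit.
            while !self.notified.swap(false, Ordering::Acquire) {
                libc::pthread_cond_wait(self.cvar.get(), self.lock.get());
            }
            libc::pthread_mutex_unlock(self.lock.get());
        }
    }

    fn unpark(&self) {
        unsafe {
            libc::pthread_mutex_lock(self.lock.get());
            self.notified.store(true, Ordering::Release);
            libc::pthread_cond_signal(self.cvar.get());
            libc::pthread_mutex_unlock(self.lock.get());
        }
    }
}
```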
Use modern formatting for format! macros
This updates the standard library's documentation to use the new format_args syntax.
The documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
`eprintln!("{}", e)` becomes `eprintln!("{e}")`, but `eprintln!("{}", e.kind())` remains untouched.
Remove argument from closure in thread::Scope::spawn.
This implements `@danielhenrymantilla`'s [suggestion](https://github.com/rust-lang/rust/issues/93203#issuecomment-1040798286) for improving the scoped threads interface.
Summary:
The `Scope` type gets an extra lifetime argument, which represents basically its own lifetime that will be used in `&'scope Scope<'scope, 'env>`:
```diff
- pub struct Scope<'env> { .. };
+ pub struct Scope<'scope, 'env: 'scope> { .. }
pub fn scope<'env, F, T>(f: F) -> T
where
- F: FnOnce(&Scope<'env>) -> T;
+ F: for<'scope> FnOnce(&'scope Scope<'scope, 'env>) -> T;
```
This simplifies the `spawn` function, which now no longer passes an argument to the closure you give it, and now uses the `'scope` lifetime for everything:
```diff
- pub fn spawn<'scope, F, T>(&'scope self, f: F) -> ScopedJoinHandle<'scope, T>
+ pub fn spawn<F, T>(&'scope self, f: F) -> ScopedJoinHandle<'scope, T>
where
- F: FnOnce(&Scope<'env>) -> T + Send + 'env,
+ F: FnOnce() -> T + Send + 'scope,
- T: Send + 'env;
+ T: Send + 'scope;
```
The only difference the user will notice, is that their closure now takes no arguments anymore, even when spawning threads from spawned threads:
```diff
thread::scope(|s| {
- s.spawn(|_| {
+ s.spawn(|| {
...
});
- s.spawn(|s| {
+ s.spawn(|| {
...
- s.spawn(|_| ...);
+ s.spawn(|| ...);
});
});
```
<details><summary>And, as a bonus, errors get <em>slightly</em> better because now any lifetime issues point to the outermost <code>s</code> (since there is only one <code>s</code>), rather than the innermost <code>s</code>, making it clear that the lifetime lasts for the entire <code>thread::scope</code>.
</summary>
```diff
error[E0373]: closure may outlive the current function, but it borrows `a`, which is owned by the current function
--> src/main.rs:9:21
|
- 7 | s.spawn(|s| {
- | - has type `&Scope<'1>`
+ 6 | thread::scope(|s| {
+ | - lifetime `'1` appears in the type of `s`
9 | s.spawn(|| println!("{:?}", a)); // might run after `a` is dropped
| ^^ - `a` is borrowed here
| |
| may outlive borrowed value `a`
|
note: function requires argument type to outlive `'1`
--> src/main.rs:9:13
|
9 | s.spawn(|| println!("{:?}", a)); // might run after `a` is dropped
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to force the closure to take ownership of `a` (and any other referenced variables), use the `move` keyword
|
9 | s.spawn(move || println!("{:?}", a)); // might run after `a` is dropped
| ++++
"
```
</details>
The downside is that the signature of `scope` and `Scope` gets slightly more complex, but in most cases the user wouldn't need to write those, as they just use the argument provided by `thread::scope` without having to name its type.
Another downside is that this does not work nicely in Rust 2015 and Rust 2018, since in those editions, `s` would be captured by reference and not by copy. In those editions, the user would need to use `move ||` to capture `s` by copy. (Which is what the compiler suggests in the error.)
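For completeness, a small stand-alone example of the new interface (illustrative, not taken from the PR):
```rust
use std::thread;

fn main() {
    let mut counter = 0;
    let data = vec![1, 2, 3];

    thread::scope(|s| {
        s.spawn(|| {
            println!("borrowed by the first thread: {data:?}");
        });
        s.spawn(|| {
            counter += data.len();
        });
    });

    // Both threads have been joined here, so the borrows have ended.
    println!("counter = {counter}");
}
```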
Add diagnostic items for macros
For use in Clippy, it adds diagnostic items to all the stable public macros
Clippy has lints that look for almost all of these (currently by name or path), but there are a few that aren't currently part of any lint; I could remove those if it's preferred to add them as needed rather than ahead of time.
Simplify thread::JoinInner.
`JoinInner`'s `native` field was an `Option`, but that's unnecessary.
Also, thanks to `Arc::get_mut`, there's no unsafety needed in `JoinInner::join()`.
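A simplified illustration of the `Arc::get_mut` point (not the actual `JoinInner` code; the packet type is a stand-in):
```rust
use std::sync::Arc;

// Stand-in for the value shared with the spawned thread (the real packet
// uses interior mutability so the thread can write its result).
type Packet<T> = Arc<Option<T>>;

fn take_result<T>(mut packet: Packet<T>) -> Option<T> {
    // Once the spawned thread has finished and dropped its clone of the Arc,
    // the joiner is the sole owner, so `get_mut` returns `Some` and the
    // result can be taken without any `unsafe`.
    Arc::get_mut(&mut packet).and_then(|slot| slot.take())
}
```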
It now panic!()s on its own, rather than resume_unwind'ing the panic
payload from the thread. Using resume_unwind skips the panic_handler,
meaning that the main thread would never have a panic handler run, which
can get confusing.
Clarify the guarantees that ThreadId does and doesn't make.
The existing documentation does not spell out whether `ThreadId`s are unique during the lifetime of a thread or of a process. I had to examine the source code to realise (pleasingly!) that they're unique for the lifetime of a process. That seems worth documenting clearly, as it's a strong guarantee.
Examining the way `ThreadId`s are created also made me realise that the `as_u64` method on `ThreadId` could be a trap for the unwary on those platforms where the platform's notion of a thread identifier is also a 64 bit integer (particularly if they happen to use a similar identifier scheme to `ThreadId`). I therefore think it's worth being even clearer that there's no relationship between the two.
std: Stabilize the `thread_local_const_init` feature
This commit is intended to follow the stabilization disposition of the
FCP that has now finished in #84223. This stabilizes the ability to flag
thread local initializers as `const` expressions which enables the macro
to generate more efficient code for accessing it, notably removing
runtime checks for initialization.
More information can also be found in #84223 as well as the tests where
the feature usage was removed in this PR.
Closes #84223
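For reference, a small usage example of the now-stable form (the contents are illustrative):
```rust
use std::cell::Cell;

thread_local! {
    // The initializer is a `const` block, so no lazy-initialization check is
    // needed on access (for types without destructors).
    static COUNTER: Cell<u32> = const { Cell::new(0) };
}

fn bump() -> u32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

fn main() {
    assert_eq!(bump(), 1);
    assert_eq!(bump(), 2);
}
```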
The "fixed" in "fixed steady state limits" means to exclude load-dependent resource prioritization
that would calculate to 100% of capacity on an idle system and less capacity on a loaded system.
Additionally I also exclude "system load" since it would be silly to try to identify
other, perhaps higher priority, processes hogging some CPU cores that aren't explicitly excluded
by masks/quotas/whatever.
std: Tweak expansion of thread-local const
This commit tweaks the expansion of `thread_local!` when combined with a
`const { ... }` value to help ensure that the rules which apply to
`const { ... }` blocks will be the same as when they're stabilized.
Previously with this invocation:
thread_local!(static NAME: Type = const { init_expr });
this would generate (on supporting platforms):
#[thread_local]
static NAME: Type = init_expr;
instead the macro now expands to:
const INIT_EXPR: Type = init_expr;
#[thread_local]
static NAME: Type = INIT_EXPR;
with the hope that because `init_expr` is defined as a `const` item then
it's not accidentally allowing more behavior than if it were put into a
`static`. For example on the stabilization issue [this example][ex] now
gives the same error both ways.
[ex]: https://github.com/rust-lang/rust/issues/84223#issuecomment-953384298
This commit goes through and updates various `#[cfg]` as appropriate to
get the wasm64-unknown-unknown target behaving similarly to the
wasm32-unknown-unknown target. Most of this is just updating various
conditions for `target_arch = "wasm32"` to also account for `target_arch
= "wasm64"` where appropriate. This commit also lists `wasm64` as an
allow-listed architecture to not have the `restricted_std` feature
enabled, enabling experimentation with `-Z build-std` externally.
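Illustrative of the kind of `#[cfg]` change involved (not a specific diff from this commit):
```rust
// A condition that previously matched only wasm32 now also matches wasm64.
#[cfg(any(target_arch = "wasm32", target_arch = "wasm64"))]
mod wasm_support {
    // wasm-specific code lives here on both pointer widths
}
```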
The main goal of this commit is to enable playing around with
`wasm64-unknown-unknown` externally via `-Z build-std` in a way that's
similar to the `wasm32-unknown-unknown` target. These targets are
effectively the same and only differ in their pointer size, but wasm64
is much newer and has much less ecosystem/library support so it'll still
take time to get wasm64 fully-fledged.
Add JoinHandle::is_running.
This adds:
```rust
impl<T> JoinHandle<T> {
/// Checks if the associated thread is still running its main function.
///
/// This might return `false` for a brief moment after the thread's main
/// function has returned, but before the thread itself has stopped running.
pub fn is_running(&self) -> bool;
}
```
The usual way to check if a background thread is still running is to set some atomic flag at the end of its main function. We already do that, in the form of dropping an Arc which will reduce the reference counter. So we might as well expose that information.
This is useful in applications with a main loop (e.g. a game, gui, control system, ..) where you spawn some background task, and check every frame/iteration whether the background task is finished to .join() it in that frame/iteration while keeping the program responsive.
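A small usage sketch of that pattern (illustrative):
```rust
use std::{thread, time::Duration};

fn main() {
    let handle = thread::spawn(|| {
        thread::sleep(Duration::from_millis(500));
        42
    });

    // Poll each "frame" of a main loop, keeping the program responsive while
    // the background task runs.
    while handle.is_running() {
        thread::sleep(Duration::from_millis(16));
    }

    // The thread's main function has returned, so join() won't block for long.
    let result = handle.join().unwrap();
    println!("background task returned {result}");
}
```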
Add #[must_use] to remaining std functions (O-Z)
I've run out of compelling reasons to group functions together across crates so I'm just going to go module-by-module. This is half of the remaining items from the `std` crate, from O-Z.
`panicking::take_hook` has a side effect: it unregisters the current panic hook, returning it. I almost ignored it, but the documentation example shows `let _ = panic::take_hook();`, so following suit I went ahead and added a `#[must_use]`.
```rust
std::panicking fn take_hook() -> Box<dyn Fn(&PanicInfo<'_>) + 'static + Sync + Send>;
```
I added these functions that clippy did not flag:
```rust
std::path::Path fn starts_with<P: AsRef<Path>>(&self, base: P) -> bool;
std::path::Path fn ends_with<P: AsRef<Path>>(&self, child: P) -> bool;
std::path::Path fn with_file_name<S: AsRef<OsStr>>(&self, file_name: S) -> PathBuf;
std::path::Path fn with_extension<S: AsRef<OsStr>>(&self, extension: S) -> PathBuf;
```
Parent issue: #89692
r? `@joshtriplett`
Improve `std::thread::available_parallelism` docs
_Tracking issue: https://github.com/rust-lang/rust/issues/74479_
This PR reworks the documentation of `std::thread::available_parallelism`, as requested [here](https://github.com/rust-lang/rust/pull/89324#issuecomment-934343254).
## Changes
The following changes are made:
- We've removed prior mentions of "hardware threads" and instead center the docs around "parallelism" as a resource available to a program.
- We now provide examples of when `available_parallelism` may return numbers that differ from the number of CPU cores in the host machine.
- We now mention that the amount of available parallelism may change over time.
- We make note of which platform components we don't take into account which more advanced users may want to take note of.
- The example has been updated, which should be a bit easier to use.
- We've added a docs alias to `num-cpus` which provides similar functionality to `available_parallelism`, and is one of the most popular crates on crates.io.
---
Thanks!
r? `@BurntSushi`
Rename `std::thread::available_concurrency` to `std::thread::available_parallelism`
_Tracking issue: https://github.com/rust-lang/rust/issues/74479_
This PR renames `std::thread::available_concurrency` to `std::thread::available_parallelism`.
## Rationale
The API was initially named `std::thread::hardware_concurrency`, mirroring the [C++ API of the same name](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency). We eventually decided to omit any reference to the word "hardware" after [this comment](https://github.com/rust-lang/rust/pull/74480#issuecomment-662045841). And so we ended up with `available_concurrency` instead.
---
For a talk I was preparing this week I was reading through ["Understanding and expressing scalable concurrency" (A. Turon, 2013)](http://aturon.github.io/academic/turon-thesis.pdf), and the following passage stood out to me (emphasis mine):
> __Concurrency is a system-structuring mechanism.__ An interactive system that deals with disparate asynchronous events is naturally structured by division into concurrent threads with disparate responsibilities. Doing so creates a better fit between problem and solution, and can also decrease the average latency of the system by preventing long-running computations from obstructing quicker ones.
> __Parallelism is a resource.__ A given machine provides a certain capacity for parallelism, i.e., a bound on the number of computations it can perform simultaneously. The goal is to maximize throughput by intelligently using this resource. For interactive systems, parallelism can decrease latency as well.
_Chapter 2.1: Concurrency is not Parallelism. Page 30._
---
_"Concurrency is a system-structuring mechanism. Parallelism is a resource."_ — It feels like this accurately captures the way we should be thinking about these APIs. What this API returns is not "the amount of concurrency available to the program" which is a property of the program, and thus even with just a single thread is effectively unbounded. But instead it returns "the amount of _parallelism_ available to the program", which is a resource hard-constrained by the machine's capacity (and can be further restricted by e.g. operating systems).
That's why I'd like to propose we rename this API from `available_concurrency` to `available_parallelism`. This still meets the criteria we previously established of not attempting to define what exactly we mean by "hardware", "threads", and other such words. Instead we only talk about "concurrency" as an abstract resource available to our program.
r? `@joshtriplett`
Previously the thread name would first be heap allocated and then
re-allocated to add a nul terminator. Now it will be heap allocated only
once, with the nul terminator added from the start.
- also clarifies how thread.join and detaching of threads works
- the previous prose implied that there is a relationship between a
spawning thread and the thread being spawned, and that "child" threads
couldn't outlive their parents unless detached, which is incorrect.
The old documentation suggested the use of yield_now for repeated
polling instead of discouraging it; it also made the false claim that
channels are implemented using yield_now. (They are not, except for
a corner case.)
Due to the std/alloc split, it is not possible to make
`alloc::collections::TryReserveError::AllocError` non-exhaustive without
having an unstable, doc-hidden method to construct it (which would negate
the benefits of `#[non_exhaustive]`).
Rollup of 11 pull requests
Successful merges:
- #85054 (Revert SGX inline asm syntax)
- #85182 (Move `available_concurrency` implementation to `sys`)
- #86037 (Add `io::Cursor::{remaining, remaining_slice, is_empty}`)
- #86114 (Reopen #79692 (Format symbols under shared frames))
- #86297 (Allow to pass arguments to rustdoc-gui tool)
- #86334 (Resolve type aliases to the type they point to in intra-doc links)
- #86367 (Fix comment about rustc_inherit_overflow_checks in abs().)
- #86381 (Add regression test for issue #39161)
- #86387 (Remove `#[allow(unused_lifetimes)]` which is now unnecessary)
- #86398 (Add regression test for issue #54685)
- #86493 (Say "this enum variant takes"/"this struct takes" instead of "this function takes")
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
std: Attempt again to inline thread-local-init across crates
Issue #25088 has been part of `thread_local!` for quite some time now.
Historical attempts have been made to add `#[inline]` to `__getit`
in #43931, #50252, and #59720, but these attempts ended up not landing
at the time due to segfaults on Windows.
In the interim though with `const`-initialized thread locals AFAIK this
is the only remaining bug which is why you might want to use
`#[thread_local]` over `thread_local!`. As a result I figured it was
time to resubmit this and see how it fares on CI and if I can help
debugging any issues that crop up.
Closes #25088
Replace all `fmt.pad` with `debug_struct`
This replaces any occurrence of:
- `f.pad("X")` with `f.debug_struct("X").finish()`
- `f.pad("X { .. }")` with `f.debug_struct("X").finish_non_exhaustive()`
This is in line with existing formatting code such as
1255053067/library/std/src/sync/mpsc/mod.rs (L1470-L1475)
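For illustration (the type here is a stand-in), the replacement looks like this:
```rust
use std::fmt;

struct Sender;

impl fmt::Debug for Sender {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // before: f.pad("Sender { .. }")
        f.debug_struct("Sender").finish_non_exhaustive()
    }
}

fn main() {
    assert_eq!(format!("{:?}", Sender), "Sender { .. }");
}
```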
This commit adds a variant of the `thread_local!` macro as a new
`thread_local_const_init!` macro which requires that the initialization
expression is constant (e.g. could be stuck into a `const` if so
desired). This form of thread local allows for a more efficient
implementation of `LocalKey::with` both if the value has a destructor
and if it doesn't. If the value doesn't have a destructor then `with`
should desugar to exactly as-if you use `#[thread_local]` given
sufficient inlining.
The purpose of this new form of thread locals is to precisely be
equivalent to `#[thread_local]` on platforms where possible for values
which fit the bill (those without destructors). This should help close
the gap in performance between `thread_local!`, which is safe, relative
to `#[thread_local]`, which is not easy to use in a portable fashion.