linux/aarch64 Now() should be actually_monotonic()
While issues have been seen on arm64 platforms the Arm architecture requires
that the counter monotonically increases and that it must provide a uniform
view of system time (e.g. it must not be possible for a core to receive a
message from another core with a time stamp and observe time going backwards
(ARM DDI 0487G.b D11.1.2). While there have been a few 64bit SoCs that have
bugs (#49281, #56940) which cause time to not monotonically increase, these have
been fixed in the Linux kernel and we shouldn't penalize all Arm SoCs for those
who refuse to update their kernels:
SUN50I_ERRATUM_UNKNOWN1 - Allwinner A64 / Pine A64 - fixed in 5.1
FSL_ERRATUM_A008585 - Freescale LS2080A/LS1043A - fixed in 4.10
HISILICON_ERRATUM_161010101 - Hisilicon 1610 - fixed in 4.11
ARM64_ERRATUM_858921 - Cortex A73 - fixed in 4.12
255a3f3e18 std: Force `Instant::now()` to be monotonic added a Mutex to work around
this problem and a small test program using glommio shows the majority of time spent
acquiring and releasing this Mutex. 3914a7b0da tries to improve this, but actually
makes it worse on big systems as for 128b atomics a ldxp/stxp pair (and successful loop)
for v8.4 systems that don't support FEAT_LSE2 is required which is expensive as a lock
and because of how the load/store-exclusives scale on large Arm systems is both unfair
to threads and tends to go backwards in performance.
A small sample program using glommio improves by 70x on a 32 core Graviton2
system with this change.
Add `#[repr(i8)]` to `Ordering`
Followup to #89491 to allow `Ordering` to auto-derive `AsRepr` once the proposal to add `AsRepr` (#81642) lands.
cc ``@joshtriplett``
updating docs to mention usage of AtomicBool
Mouse mentioned we should point out that atomic bool is used by the std lib these days. ( https://github.com/m-ou-se/getrandom/pull/1 )
[fuchsia] Update process info struct
The fuchsia platform is in the process of softly transitioning over to
using a new value for ZX_INFO_PROCESS with a new corresponding struct.
This change migrates libstd.
See [fxrev.dev/510478](https://fxrev.dev/510478) and [fxbug.dev/30751](https://fxbug.dev/30751) for more detail.
Stabilize `unreachable_unchecked` as `const fn`
Closes#53188
This PR stabilizes `core::hint::unreachable_unchecked` as `const fn`. MIRI is able to detect when this method is called. Stabilization was delayed until `const_panic` was stabilized so as to avoid users calling this method in its place (thus resulting in runtime UB). With #89508, that is no longer an issue.
````@rustbot```` label +A-const-eval +A-const-fn +T-lang +S-blocked
(not sure why it's T-lang, but that's what the tracking issue is)
Add abstract namespace support for Unix domain sockets
Hello! The other day I wanted to mess around with UDS in Rust and found that abstract namespaces ([unix(7)](https://man7.org/linux/man-pages/man7/unix.7.html)) on Linux still needed development. I took the approach of adding `_addr` specific public functions to reduce conflicts.
Feature name: `unix_socket_abstract`
Tracking issue: #85410
Further context: #42048
## Non-platform specific additions
`UnixListener::bind_addr(&SocketAddr) -> Result<UnixListener>`
`UnixStream::connect_addr(&SocketAddr) -> Result<()>`
`UnixDatagram::bind_addr(&SocketAddr) -> Result<UnixDatagram>`
`UnixDatagram::connect_addr(&SocketAddr) -> Result<()>`
`UnixDatagram::send_to_addr(&self, &[u8], &SocketAddr) -> Result<usize>`
## Platform-specific (Linux) additions
`SocketAddr::from_abstract_namespace(&[u8]) -> SocketAddr`
`SockerAddr::as_abstract_namespace() -> Option<&[u8]>`
## Example
```rust
#![feature(unix_socket_abstract)]
use std::os::unix::net::{UnixListener, SocketAddr};
fn main() -> std::io::Result<()> {
let addr = SocketAddr::from_abstract_namespace(b"namespace")?; // Linux only
let listener = match UnixListener::bind_addr(&addr) {
Ok(sock) => sock,
Err(err) => {
println!("Couldn't bind: {:?}", err);
return Err(err);
}
};
Ok(())
}
```
## Further Details
The main inspiration for the implementation came from the [nix-rust](https://github.com/nix-rust/nix/blob/master/src/sys/socket/addr.rs#L558) crate but there are also other [historical](c4db0685b1) [attempts](https://github.com/tormol/uds/blob/master/src/addr.rs#L324) with similar approaches.
A comment I did have was with this change, we now allow a `SocketAddr` to be constructed explicitly rather than just used almost as a handle for the return of `peer_addr` and `local_addr`. We could consider adding other explicit constructors (e.g. `SocketAddr::from_pathname`, `SockerAddr::from_unnamed`).
Cheers!
Use BCryptGenRandom instead of RtlGenRandom on Windows.
This removes usage of RtlGenRandom on Windows, in favour of BCryptGenRandom.
BCryptGenRandom isn't available on XP, but we dropped XP support a while ago.
The fuchsia platform is in the process of softly transitioning over to
using a new value for ZX_INFO_PROCESS with a new corresponding struct.
This change migrates libstd.
See fxrev.dev/510478 and fxbug.dev/30751 for more detail.
Avoid allocations and copying in Vec::leak
The [`Vec::leak`] method (#62195) is currently implemented by calling `Vec::into_boxed_slice` and `Box::leak`. This shrinks the vector before leaking it, which potentially causes a reallocation and copies the vector's contents.
By avoiding the conversion to `Box`, we can instead leak the vector without any expensive operations, just by returning a slice reference and forgetting the `Vec`. Users who *want* to shrink the vector first can still do so by calling `shrink_to_fit` explicitly.
**Note:** This could break code that uses `Box::from_raw` to “un-leak” the slice returned by `Vec::leak`. However, the `Vec::leak` docs explicitly forbid this, so such code is already incorrect.
[`Vec::leak`]: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.leak
Optimize VecDeque::append
Optimize `VecDeque::append` to do unsafe copy rather than iterating through each element.
On my `Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz`, the benchmark shows 37% improvements:
```
Master:
custom-bench vec_deque_append 583164 ns/iter
custom-bench vec_deque_append 550040 ns/iter
Patched:
custom-bench vec_deque_append 349204 ns/iter
custom-bench vec_deque_append 368164 ns/iter
```
Additional notes on the context: this is the third attempt to implement a non-trivial version of `VecDeque::append`, the last two are reverted due to unsoundness or regression, see:
- https://github.com/rust-lang/rust/pull/52553, reverted in https://github.com/rust-lang/rust/pull/53571
- https://github.com/rust-lang/rust/pull/53564, reverted in https://github.com/rust-lang/rust/pull/54851
Both cases are covered by existing tests.
Signed-off-by: tabokie <xy.tao@outlook.com>
Fix ctrl-c causing reads of stdin to return empty on Windows.
Pressing ctrl+c (or ctrl+break) on Windows caused a blocking read of stdin to unblock and return empty, unlike other platforms which continue to block.
On ctrl-c, `ReadConsoleW` will return success, but also set `LastError` to `ERROR_OPERATION_ABORTED`.
This change detects this case, and re-tries the call to `ReadConsoleW`.
Fixes#89177. See issue for further details.
Tested on Windows 7 and Windows 10 with both MSVC and GNU toolchains
Add `const_eval_select` intrinsic
Adds an intrinsic that calls a given function when evaluated at compiler time, but generates a call to another function when called at runtime.
See https://github.com/rust-lang/const-eval/issues/7 for previous discussion.
r? `@oli-obk.`
Improve `std:🧵:available_parallelism` docs
_Tracking issue: https://github.com/rust-lang/rust/issues/74479_
This PR reworks the documentation of `std:🧵:available_parallelism`, as requested [here](https://github.com/rust-lang/rust/pull/89324#issuecomment-934343254).
## Changes
The following changes are made:
- We've removed prior mentions of "hardware threads" and instead centers the docs around "parallelism" as a resource available to a program.
- We now provide examples of when `available_parallelism` may return numbers that differ from the number of CPU cores in the host machine.
- We now mention that the amount of available parallelism may change over time.
- We make note of which platform components we don't take into account which more advanced users may want to take note of.
- The example has been updated, which should be a bit easier to use.
- We've added a docs alias to `num-cpus` which provides similar functionality to `available_parallelism`, and is one of the most popular crates on crates.io.
---
Thanks!
r? `@BurntSushi`
fix minor spelling error in Poll::ready docs
Fixes minor spelling error in the proposed `Poll::ready` docs. Not that my opinion matters, but +1 on the original PR (#89651), it reads much nicer to me than the `ready!` macro.
Add #[must_use] to is_condition tests
I threw in `std::path::Path::has_root` for funsies.
A continuation of #89718.
Parent issue: #89692
r? ```@joshtriplett```
Add #[must_use] to non-mutating verb methods
These are methods that could be misconstrued to mutate their input, similar to #89694. I gave each one a different custom message.
I wrote that `upgrade` and `downgrade` don't modify the input pointers. Logically they don't, but technically they do...
Parent issue: #89692
r? ```@joshtriplett```
Add #[must_use] to From::from and Into::into
Risk of churn: **High**
Magic 8-Ball says: **Outlook not so good**
I figured I'd put this out there. If we don't do it now maybe we save it for a rainy day.
Parent issue: #89692
r? `@joshtriplett`
Speedup int log10 branchless
This is achieved with a branchless bit-twiddling implementation of the case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
```
Benchmark on an M1 Mac Mini:
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
```
The 128 bit case remains ridiculously slow because llvm fails to optimize division by a constant 128-bit value to multiplications. This could be worked around but it seems preferable to fix this in llvm.
From u32 up, table lookup (like suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but requires a hardware `leading_zeros` to be viable, and might clog up the cache.