Rollup merge of #88651 - AGSaidi:monotonize-inner-64b-aarch64, r=dtolnay

Use the 64b inner::monotonize() implementation not the 128b one for aarch64

aarch64 prior to v8.4 (FEAT_LSE2) doesn't have an instruction that guarantees untorn 128b reads except for successfully completing a 128b load/store exclusive pair (ldxp/stxp) or compare-and-swap (casp). The requirement to complete a 128b read+write atomic is actually more expensive and more unfair than the previous implementation of monotonize(), which used a Mutex on aarch64, especially at large core counts.

For aarch64, switch to the 64b atomic implementation, which is about 13x faster for a benchmark that involves many calls to Instant::now().
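To illustrate why a single 64b atomic is enough to keep readings from going backwards, here is a minimal sketch. It is not the code in library/std/src/time/monotonic.rs: it assumes the reading is represented as a Duration offset from a caller-chosen base, and the names LAST_NANOS and monotonize_offset are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Largest offset handed out so far, in nanoseconds since a caller-chosen base.
/// (Hypothetical name; not part of std.)
static LAST_NANOS: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper: returns an offset that is never smaller than any
/// offset returned by an earlier call, using one 64-bit atomic operation.
fn monotonize_offset(raw: Duration) -> Duration {
    // Saturate rather than truncate; u64 nanoseconds cover roughly 584 years.
    let raw_nanos = raw.as_nanos().min(u64::MAX as u128) as u64;
    // fetch_max atomically stores max(previous, raw_nanos) and returns the
    // previous value, so no lock and no 128-bit operation is needed.
    let prev = LAST_NANOS.fetch_max(raw_nanos, Ordering::Relaxed);
    Duration::from_nanos(prev.max(raw_nanos))
}
```

If a caller passes 10 ms and then 7 ms (a clock that appears to step back), the second call returns 10 ms again instead of the smaller value.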
2025-04-28 02:57:37 +00:00 · 2021-10-04 23:56:17 -07:00 · dd223d5c6d
commit dd223d5c6d
parent 7a09755148 ce450f893d
1 changed file with 2 additions and 2 deletions
--- a/library/std/src/time/monotonic.rs
+++ b/library/std/src/time/monotonic.rs
@@ -5,7 +5,7 @@ pub(super) fn monotonize(raw: time::Instant) -> time::Instant {
    inner::monotonize(raw)
 }

-#[cfg(all(target_has_atomic = "64", not(target_has_atomic = "128")))]
+#[cfg(any(all(target_has_atomic = "64", not(target_has_atomic = "128")), target_arch = "aarch64"))]
 pub mod inner {
    use crate::sync::atomic::AtomicU64;
    use crate::sync::atomic::Ordering::*;
@@ -71,7 +71,7 @@ pub mod inner {
    }
 }

-#[cfg(target_has_atomic = "128")]
+#[cfg(all(target_has_atomic = "128", not(target_arch = "aarch64")))]
 pub mod inner {
    use crate::sync::atomic::AtomicU128;
    use crate::sync::atomic::Ordering::*;
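
The two new predicates are meant to stay mutually exclusive, so that at most one `inner` module is compiled for any target. The sketch below evaluates the same predicates with `cfg!` to show which one the current build target would select. The function names are hypothetical, and whether `target_has_atomic = "128"` is set depends on the target and toolchain.

```rust
/// True when the 64-bit predicate from the diff holds for the build target.
/// (Hypothetical helper for illustration.)
fn selects_64bit_inner() -> bool {
    cfg!(any(
        all(target_has_atomic = "64", not(target_has_atomic = "128")),
        target_arch = "aarch64"
    ))
}

/// True when the 128-bit predicate from the diff holds for the build target.
/// (Hypothetical helper for illustration.)
fn selects_128bit_inner() -> bool {
    cfg!(all(target_has_atomic = "128", not(target_arch = "aarch64")))
}
```

Both functions can return false on a target without 64b atomics, but they can never both return true: the 128b predicate requires a non-aarch64 target with 128b atomics, which makes both arms of the 64b predicate false.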