Commit Graph

7178 Commits

Author SHA1 Message Date
Ralf Jung
194baa820d simd_shuffle intrinsic: allow argument to be passed as vector (not just as array) 2024-08-13 07:51:17 +02:00
Matthias Krüger
522d43673a
Rollup merge of #129017 - its-the-shrimp:core_fmt_from_fn, r=Noratrieb
Replace `std::fmt:FormatterFn` with `std::fmt::from_fn`

Modelled after the suggestion made in [this](https://github.com/rust-lang/rust/issues/117729#issuecomment-1837628559) comment, this should bring this functionality in line with the existing `array::from_fn` & `iter::from_fn`
2024-08-12 23:10:52 +02:00
schvv31n
027b19fa9b std::fmt::FormatterFn -> std::fmt::FromFn 2024-08-12 18:33:30 +01:00
Guillaume Gomez
095ca33bb6
Rollup merge of #128149 - RalfJung:nontemporal_store, r=jieyouxu,Amanieu,Jubilee
nontemporal_store: make sure that the intrinsic is truly just a hint

The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](https://github.com/llvm/llvm-project/issues/64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.

~~Blocked on https://github.com/rust-lang/stdarch/pull/1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~

Fixes https://github.com/rust-lang/rust/issues/114582
Cc `@Amanieu` `@workingjubilee`
2024-08-12 17:09:14 +02:00
bors
1d8f135b20 Auto merge of #128862 - cblh:fix/128855, r=scottmcm
fix:  #128855 Ensure `Guard`'s `drop` method is removed at `opt-level=s` for `…

fix: #128855

…Copy` types

Added `#[inline]` to the `drop` method in the `Guard` implementation to ensure that the method is removed by the compiler at optimization level `opt-level=s` for `Copy` types. This change aims to align the method's behavior with optimization expectations and ensure it does not affect performance.

r​? `@scottmcm`
2024-08-12 05:22:03 +00:00
bors
13f8a57cfb Auto merge of #126793 - saethlin:mono-rawvec, r=scottmcm
Apply "polymorphization at home" to RawVec

The idea here is to move all the logic in RawVec into functions with explicit size and alignment parameters. This should eliminate all the fussing about how tweaking RawVec code produces large swings in compile times.

This uncovered https://github.com/rust-lang/rust-clippy/issues/12979, so I've modified the relevant test in a way that tries to preserve the spirit of the test without tripping the ICE.
2024-08-12 01:47:06 +00:00
Matthias Krüger
2c88eb9805
Rollup merge of #128882 - RalfJung:local-waker-will-wake, r=cuviper
make LocalWaker::will_wake consistent with Waker::will_wake

This mirrors https://github.com/rust-lang/rust/pull/119863 for `LocalWaker`. Looks like that got missed in the initial `LocalWaker` PR (https://github.com/rust-lang/rust/pull/118960).
2024-08-11 07:51:52 +02:00
Matthias Krüger
e8f6819db7
Rollup merge of #120314 - mina86:i, r=Mark-Simulacrum
core: optimise Debug impl for ascii::Char

Rather than writing character at a time, optimise Debug implementation
for core::ascii::Char such that it writes the entire representation
with a single write_str call.

With that, add tests for Display and Debug.

Issue: https://github.com/rust-lang/rust/issues/110998
2024-08-11 07:51:49 +02:00
bors
04ba50e823 Auto merge of #128927 - GuillaumeGomez:rollup-ei2lr0f, r=GuillaumeGomez
Rollup of 8 pull requests

Successful merges:

 - #128273 (Improve `Ord` violation help)
 - #128807 (run-make: explaing why fmt-write-bloat is ignore-windows)
 - #128903 (rustdoc-json-types `Discriminant`: fix typo)
 - #128905 (gitignore: Add Zed and Helix editors)
 - #128908 (diagnostics: do not warn when a lifetime bound infers itself)
 - #128909 (Fix dump-ice-to-disk for RUSTC_ICE=0 users)
 - #128910 (Differentiate between methods and associated functions in diagnostics)
 - #128923 ([rustdoc] Stop showing impl items for negative impls)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-08-10 15:13:38 +00:00
Guillaume Gomez
65875b2f5a
Rollup merge of #128273 - Voultapher:improve-ord-violation-help, r=workingjubilee
Improve `Ord` violation help

Recent experience in #128083 showed that the panic message when an Ord violation is detected by the new sort implementations can be confusing. So this PR aims to improve it, together with minor bug fixes in the doc comments for sort*, sort_unstable* and select_nth_unstable*.

Is it possible to get these changes into the 1.81 release? It doesn't change behavior and would greatly help when users encounter this panic for the first time, which they may after upgrading to 1.81.

Tagging `@orlp`
2024-08-10 16:23:51 +02:00
Nadrieril
c256de2253 Update std and compiler 2024-08-10 12:07:17 +02:00
Ben Kimock
d6c0ebef50 Polymorphize RawVec 2024-08-09 20:06:26 -04:00
Michal Nazarewicz
7d1de7f994 core: optimise Debug impl for ascii::Char
Rather than writing character at a time, optimise Debug implementation
for core::ascii::Char such that it writes the entire representation as
with a single write_str call.

With that, add tests for Display and Debug implementations.
2024-08-09 22:50:57 +02:00
Ralf Jung
ae09340350 make LocalWaker::will_wake consistent with Waker::will_wake 2024-08-09 18:05:57 +02:00
Lukas Bergdoll
1be60b5d2b Fix linkchecker issue 2024-08-09 15:05:37 +02:00
burlinchen
bca0c5f2a9 fix: Ensure Guard's drop method is removed at opt-level=s for Copy types
Added `#[inline]` to the `drop` method in the `Guard` implementation to ensure that the method is removed by the compiler at optimization level `opt-level=s` for `Copy` types. This change aims to align the method's behavior with optimization expectations and ensure it does not affect performance.
2024-08-09 11:10:30 +08:00
Matthias Krüger
127fbc8481
Rollup merge of #128749 - tgross35:float-inline, r=scottmcm
Mark `{f32,f64}::{next_up,next_down,midpoint}` inline

Most float functions are marked `#[inline]` so any float symbols used by these functions only need to be provided if the function itself is used. RFL recently noticed that `next_up`, `next_down`, and `midpoint` for `f32` and `f64` are not inline, which causes linker errors when building with certain configurations <https://lore.kernel.org/all/20240806150619.192882-1-ojeda@kernel.org/>.

Add the missing attributes so the symbols should no longer be required.
2024-08-08 18:57:02 +02:00
Matthias Krüger
3a9dd829d0
Rollup merge of #128306 - WiktorPrzetacznik:WiktorPrzetacznik-nonnull-alignoffset-update, r=Amanieu
Update NonNull::align_offset quarantees

This PR proposes to update [`NonNull::align_offset`](https://doc.rust-lang.org/stable/std/ptr/struct.NonNull.html#method.align_offset) guarantees, which should to be matched with [`ptr::align_offset`](https://doc.rust-lang.org/stable/std/primitive.pointer.html#method.align_offset-1)
(as `NonNull::align_offset` delegates to `ptr::align_offset`).

[PR #121201](https://github.com/rust-lang/rust/pull/121201) updated only `ptr::align_offset` docs.
2024-08-08 18:57:00 +02:00
ltdk
0257f42089 Add tracking issue to core-pattern-type 2024-08-07 20:43:05 -04:00
Trevor Gross
6b3feb49c6 Mark {f32,f64}::{next_up,next_down,midpoint} inline
Most float functions are marked `#[inline]` so any float symbols used by
these functions only need to be provided if the function itself is used.
RFL recently noticed that `next_up`, `next_down`, and `midpoint` for
`f32` and `f64` are not inline, which causes linker errors when building
with certain configurations [1].

Add the missing attributes so the symbols should no longer be required.

Cc: Gary Guo <gary@garyguo.net>
Reported-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/all/20240806150619.192882-1-ojeda@kernel.org/ [1]
2024-08-07 03:24:55 -05:00
Trevor Gross
b3bfd66627
Rollup merge of #128417 - tgross35:f16-f128-math, r=dtolnay
Add `f16` and `f128` math functions

This adds intrinsics and math functions for `f16` and `f128` floating point types. Support is quite limited and some things are broken so tests don't run on many platforms, but this provides a starting point.
2024-08-06 22:17:32 -05:00
Matthias Krüger
16b251be10
Rollup merge of #125048 - dingxiangfei2009:stable-deref, r=amanieu
PinCoerceUnsized trait into core

cc ``@Darksonn`` ``@wedsonaf`` ``@ojeda``

This is a PR to introduce a `PinCoerceUnsized` trait in order to make trait impls generated by the proc-macro `#[derive(SmartPointer)]`, proposed by [RFC](e17e19ac7a/text/3621-derive-smart-pointer.md (pincoerceunsized-1)), sound. There you may find explanation, justification and discussion about the alternatives.

Note that we do not seek stabilization of this `PinCoerceUnsized` trait in the near future. The stabilisation of this trait does not block the eventual stabilization process of the `#[derive(SmartPointer)]` macro. Ideally, use of `DerefPure` is more preferrable except this will actually constitute a breaking change. `PinCoerceUnsized` emerges as a solution to the said soundness hole while avoiding the breaking change. More details on the `DerefPure` option have been described in this [section](e17e19ac7a/text/3621-derive-smart-pointer.md (derefpure)) of the RFC linked above.

Earlier discussion can be found in this [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Pin.20and.20soundness.20of.20unsizing.20coercions) and [rust-for-linux thread](https://rust-lang.zulipchat.com/#narrow/stream/425075-rust-for-linux/topic/.23.5Bderive.28SmartPointer.29.5D.20and.20pin.20unsoundness.20rfc.233621).

try-job: dist-various-2
2024-08-07 00:34:11 +02:00
Ralf Jung
212417b87f custom MIR: add support for tail calls 2024-08-05 18:23:14 +02:00
Ralf Jung
28e0907111 nontemporal_store: make sure that the intrinsic is truly just a hint 2024-08-05 10:57:14 +02:00
Matthias Krüger
74df517b90
Rollup merge of #128619 - glandium:last_chunk, r=scottmcm
Correct the const stabilization of `<[T]>::last_chunk`

`<[T]>::first_chunk` became const stable in 1.77, but `<[T]>::last_chunk` was left out. This was fixed in 3488679768, which reached stable in 1.80, making `<[T]>::last_chunk` const stable as of that version, but it is documented as being const stable as 1.77. While this is what should have happened, the documentation should reflect what actually did happen.
2024-08-05 08:22:24 +02:00
Matthias Krüger
1e951b70c7
Rollup merge of #128609 - swenson:smaller-faster-dragon, r=Amanieu
Remove unnecessary constants from flt2dec dragon

The "dragon" `flt2dec` algorithm uses multi-precision multiplication by (sometimes large) powers of 10. It has precomputed some values to help with these calculations.

BUT:

* There is no need to store powers of 10 and 2 * powers of 10: it is trivial to compute the second from the first.
* We can save a chunk of memory by storing powers of 5 instead of powers of 10 for the large powers (and just shifting as appropriate).
* This also slightly speeds up the routines (by ~1-3%) since the intermediate products are smaller and the shift is cheap.

In this PR, we remove the unnecessary constants and do the necessary adjustments.

Relevant benchmarks before (on my Threadripper 3970X, x86_64-unknown-linux-gnu):

```
num::flt2dec::bench_big_shortest                      137.92/iter   +/- 2.24
num::flt2dec::strategy:🐉:bench_big_exact_12   2135.28/iter  +/- 38.90
num::flt2dec::strategy:🐉:bench_big_exact_3     904.95/iter  +/- 10.58
num::flt2dec::strategy:🐉:bench_big_exact_inf 47230.33/iter +/- 320.84
num::flt2dec::strategy:🐉:bench_big_shortest   3915.05/iter  +/- 51.37
```

and after:

```
num::flt2dec::bench_big_shortest                      137.40/iter   +/- 2.03
num::flt2dec::strategy:🐉:bench_big_exact_12   2101.10/iter  +/- 25.63
num::flt2dec::strategy:🐉:bench_big_exact_3     873.86/iter   +/- 4.20
num::flt2dec::strategy:🐉:bench_big_exact_inf 47468.19/iter +/- 374.45
num::flt2dec::strategy:🐉:bench_big_shortest   3877.01/iter  +/- 45.74
```
2024-08-05 08:22:22 +02:00
Mike Hommey
70ab51f988 Correct the const stabilization of <[T]>::last_chunk
`<[T]>::first_chunk` became const stable in 1.77, but `<[T]>::last_chunk` was
left out. This was fixed in 3488679768, which reached stable in 1.80,
making `<[T]>::last_chunk` const stable as of that version, but it is
documented as being const stable as 1.77. While this is what should have
happened, the documentation should reflect what actually did happen.
2024-08-05 05:01:39 +09:00
Matthias Krüger
c8a33f7e34
Rollup merge of #128526 - tshepang:patch-1, r=Amanieu
time.rs: remove "Basic usage text"

Only one example is given (for each method)
2024-08-04 11:32:34 +02:00
bors
b389b0ab72 Auto merge of #128466 - sayantn:stdarch-update, r=tgross35
Update the stdarch submodule

cc `@tgross35` `@Amanieu`
r? `@tgross35`

try-job: dist-various-2
2024-08-04 02:11:27 +00:00
sayantn
2cde11f2d1 Chore: add x86_amx_intrinsics feature flag to core/lib.rs and remove issue-120720-reduce-nan.rs 2024-08-04 03:08:18 +05:30
Matthias Krüger
53a56190af
Rollup merge of #128530 - scottmcm:repeat-n-unchecked, r=joboet
Implement `UncheckedIterator` directly for `RepeatN`

This just pulls the code out of `next` into `next_unchecked`, rather than making the `Some` and `unwrap_unchecked`ing it.

And while I was touching it, I added a codegen test that `array::repeat` for something that's just `Clone`, not `Copy`, still ends up optimizing to the same thing as `[x; n]`: <https://rust.godbolt.org/z/YY3a5ajMW>.
2024-08-03 20:51:52 +02:00
Christopher Swenson
36a805939e Remove unnecessary constants from flt2dec dragon
The "dragon" `flt2dec` algorithm uses multi-precision multiplication by
(sometimes large) powers of 10. It has precomputed some values to help
with these calculations.

BUT:

* There is no need to store powers of 10 and 2 * powers of 10: it is
  trivial to compute the second from the first.
* We can save a chunk of memory by storing powers of 5 instead of powers
  of 10 for the large powers (and just shifting by 2 as appropriate).
* This also slightly speeds up the routines (by ~1-3%) since the
  intermediate products are smaller and the shift is cheap.

In this PR, we remove the unnecessary constants and do the necessary
adjustments.

Relevant benchmarks before (on my Threadripper 3970X, x86_64-unknown-linux-gnu):

```
num::flt2dec::bench_big_shortest                      137.92/iter   +/- 2.24
num::flt2dec::strategy:🐉:bench_big_exact_12   2135.28/iter  +/- 38.90
num::flt2dec::strategy:🐉:bench_big_exact_3     904.95/iter  +/- 10.58
num::flt2dec::strategy:🐉:bench_big_exact_inf 47230.33/iter +/- 320.84
num::flt2dec::strategy:🐉:bench_big_shortest   3915.05/iter  +/- 51.37
```

and after:

```
num::flt2dec::bench_big_shortest                      137.40/iter   +/- 2.03
num::flt2dec::strategy:🐉:bench_big_exact_12   2101.10/iter  +/- 25.63
num::flt2dec::strategy:🐉:bench_big_exact_3     873.86/iter   +/- 4.20
num::flt2dec::strategy:🐉:bench_big_exact_inf 47468.19/iter +/- 374.45
num::flt2dec::strategy:🐉:bench_big_shortest   3877.01/iter  +/- 45.74
```
2024-08-03 08:49:38 -07:00
Lukas Bergdoll
613155c96a Apply review comments to PartialOrd section 2024-08-03 15:10:27 +02:00
bors
1f47624f9a Auto merge of #128404 - compiler-errors:revert-dead-code-changes, r=pnkfelix
Revert recent changes to dead code analysis

This is a revert to recent changes to dead code analysis, namely:
* efdf219 Rollup merge of #128104 - mu001999-contrib:fix/128053, r=petrochenkov
* a70dc297a8 Rollup merge of #127017 - mu001999-contrib:dead/enhance, r=pnkfelix
* 31fe9628cf Rollup merge of #127107 - mu001999-contrib:dead/enhance-2, r=pnkfelix
* 2724aeaaeb Rollup merge of #126618 - mu001999-contrib:dead/enhance, r=pnkfelix
* 977c5fd419 Rollup merge of #126315 - mu001999-contrib:fix/126289, r=petrochenkov
* 13314df21b Rollup merge of #125572 - mu001999-contrib:dead/enhance, r=pnkfelix

There is an additional change stacked on top, which suppresses false-negatives that were masked by this work. I believe the functions that are touched in that code are legitimately unused functions and the types are not reachable since this `AnonPipe` type is not publically reachable -- please correct me if I'm wrong cc `@NobodyXu` who added these in ##127153.

Some of these reverts (#126315 and #126618) are only included because it makes the revert apply cleanly, and I think these changes were only done to fix follow-ups from the other PRs?

I apologize for the size of the PR and the churn that it has on the codebase (and for reverting `@mu001999's` work here), but I'm putting this PR up because I am concerned that we're making ad-hoc changes to fix bugs that are fallout of these PRs, and I'd like to see these changes reimplemented in a way that's more separable from the existing dead code pass. I am happy to review any code to reapply these changes in a more separable way.

cc `@mu001999`
r? `@pnkfelix`

Fixes #128272
Fixes #126169
2024-08-03 13:04:30 +00:00
Michael Goulet
5f5b4ee128 Revert "Rollup merge of #127107 - mu001999-contrib:dead/enhance-2, r=pnkfelix"
This reverts commit 31fe9628cf, reversing
changes made to f20307851e.
2024-08-03 07:57:31 -04:00
Matthias Krüger
8aa18290a4
Rollup merge of #126704 - sayantn:sha, r=Amanieu
Added SHA512, SM3, SM4 target-features and `sha512_sm_x86` feature gate

This is an effort towards #126624. This adds support for these 3 target-features and introduces the feature flag `sha512_sm_x86`, which would gate these target-features and the yet-to-be-implemented detection and intrinsics in stdarch.
2024-08-03 11:17:41 +02:00
bors
19326022d2 Auto merge of #128254 - Amanieu:orig-binary-search, r=tgross35
Rewrite binary search implementation

This PR builds on top of #128250, which should be merged first.

This restores the original binary search implementation from #45333 which has the nice property of having a loop count that only depends on the size of the slice. This, along with explicit conditional moves from #128250, means that the entire binary search loop can be perfectly predicted by the branch predictor.

Additionally, LLVM is able to unroll the loop when the slice length is known at compile-time. This results in a very compact code sequence of 3-4 instructions per binary search step and zero branches.

Fixes #53823
Fixes #115271
2024-08-02 08:20:35 +00:00
Scott McMurray
77ca30f195 Implement UncheckedIterator directly for RepeatN 2024-08-01 21:58:34 -07:00
Matthias Krüger
67fcb58347
Rollup merge of #128453 - RalfJung:raw_eq, r=saethlin
raw_eq: using it on bytes with provenance is not UB (outside const-eval)

The current behavior of raw_eq violates provenance monotonicity. See https://github.com/rust-lang/rust/pull/124921 for an explanation of provenance monotonicity. It is violated in raw_eq because comparing bytes without provenance is well-defined, but adding provenance makes the operation UB.

So remove the no-provenance requirement from raw_eq. However, the requirement stays in-place for compile-time invocations of raw_eq, that indeed cannot deal with provenance.

Cc `@rust-lang/opsem`
2024-08-02 06:43:43 +02:00
Tshepang Mbambo
af79a6396c
time.rs: remove "Basic usage text"
Only one example is given (for each method)
2024-08-02 05:16:01 +02:00
sayantn
41b017ec99 Add the sha512, sm3 and sm4 target features
Add the feature in `core/lib.rs`
2024-08-02 02:29:15 +05:30
Trevor Gross
8e2ca0c9d5 Add a disclaimer about x86 f128 math functions
Due to a LLVM bug, `f128` math functions link successfully but LLVM
chooses the wrong symbols (`long double` symbols rather than those for
binary128).

Since this is a notable problem that may surprise a number of users, add
a note about it.

Link: https://github.com/llvm/llvm-project/issues/44744
2024-08-01 15:38:53 -04:00
Trevor Gross
43836421f8 Update comments for {f16, f32, f64, f128}::midpoint
Clarify what makes some operations not safe, and correct comment in the
default branch ("not safe" -> "safe").
2024-08-01 15:38:53 -04:00
Trevor Gross
e18036c769 Add core functions for f16 and f128 that require math routines
`min`, `max`, and similar functions require external math routines. Add
these under the same gates as `std` math functions (`reliable_f16_math`
and `reliable_f128_math`).
2024-08-01 15:38:53 -04:00
Trevor Gross
82b40c4d8e Add math intrinsics for f16 and f128
These already exist in the compiler. Expose them in core so we can add
their library functions.
2024-08-01 15:36:15 -04:00
Matthias Krüger
ca73b8b7a5
Rollup merge of #128497 - Bryanskiy:fix-dropck-doc, r=lcnr
fix dropck documentation for `[T;0]` special-case

fixes https://github.com/rust-lang/rust/issues/110288.

r? ``@lcnr``
2024-08-01 18:43:41 +02:00
Lukas Bergdoll
eae7a186b2 Hide internal sort module 2024-08-01 17:37:07 +02:00
Bryanskiy
c8a3cafc0f fix dropck documentation for [T;0] special-case 2024-08-01 17:44:14 +03:00
Ralf Jung
f97aba2271 raw_eq: using it on bytes with provenance is not UB (outside const-eval) 2024-07-31 20:26:20 +02:00
Lukas Bergdoll
afc404fdfc Apply review comments
- Use if the implementation of [`Ord`] for `T`
  language
- Link to total order wiki page
- Rework total order help and examples
- Improve language to be more precise and less
  prone to misunderstandings.
- Fix usage of `sort_unstable_by` in `sort_by`
  example
- Fix missing author mention
- Use more consistent example input for sort
- Use more idiomatic assert_eq! in examples
- Use more natural "comparison function" language
  instead of "comparator function"
2024-07-31 11:36:57 +02:00