Commit Graph

2684 Commits

Author SHA1 Message Date
Zachary S
93e3c00670 Remove non-focused memory leaks in alloc doctests for Miri. 2024-07-06 22:24:00 -05:00
Jacob Pratt
db592253a6
Rollup merge of #123588 - tgross35:stabilize-assert_unchecked, r=dtolnay
Stabilize `hint::assert_unchecked`

Make the following API stable, including const:

```rust
// core::hint, std::hint

pub const unsafe fn assert_unchecked(p: bool);
```

This PR also reworks some of the documentation and adds an example.

Tracking issue: https://github.com/rust-lang/rust/issues/119131
FCP: https://github.com/rust-lang/rust/issues/119131#issuecomment-1906394087. The docs update should resolve the remaining concern.
2024-07-03 03:03:13 -04:00
Scott McMurray
23c8ed14c9 Avoid MIR bloat in inlining
In 126578 we ended up with more binary size increases than expected.

This change attempts to avoid inlining large things into small things, to avoid that kind of increase, in cases when top-down inlining will still be able to do that inlining later.
2024-07-01 05:17:13 -07:00
bors
b8d7dd8d69 Auto merge of #127026 - Urgau:cleanup-bootstrap-check-cfg, r=Kobzol
Cleanup bootstrap check-cfg

This PR cleanup many custom `check-cfg` in bootstrap that have been accumulated over the years.

As well as updating some outdated comments.
2024-06-30 22:27:29 +00:00
Matthias Krüger
2c228260dc
Rollup merge of #126970 - DaniPopes:simplify-str-clone_into, r=cuviper
Simplify `str::clone_into`

Removes an `unsafe` in favor of just using `String` methods.
2024-06-28 08:34:09 +02:00
Urgau
f026e0bfc1 Cleanup bootstrap check-cfg 2024-06-27 11:30:03 +02:00
Jacob Pratt
d3debc0037
Rollup merge of #126929 - nnethercote:rm-__rust_force_expr, r=oli-obk
Remove `__rust_force_expr`.

This was added (with a different name) to improve an error message. It is no longer needed -- removing it changes the error message, but overall I think the new message is no worse:
- the mention of `#` in the first line is a little worse,
- but the extra context makes it very clear what the problem is, perhaps even clearer than the old message,
- and the removal of the note about the `expr` fragment (an internal detail of `__rust_force_expr`) is an improvement.

Overall I think the error is quite clear and still far better than the old message that prompted #61933, which didn't even mention patterns.

The motivation for this is #124141, which will cause pasted metavariables to be tokenized and reparsed instead of the AST node being cached. This change in behaviour occasionally has a non-zero perf cost, and `__rust_force_expr` causes the tokenize/reparse step to occur twice. Removing `__rust_force_expr` greatly reduces the extra overhead for the `deep-vector` benchmark.

r? ```@oli-obk```
2024-06-27 02:06:19 -04:00
DaniPopes
d5ff4f4f65
Simplify str::clone_into
Removes an `unsafe` in favor of just using `String` methods.
2024-06-25 22:34:41 +02:00
Matthias Krüger
58bbade921
Rollup merge of #126302 - mu001999-contrib:ignore/default, r=michaelwoerister
Detect unused structs which derived Default

<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.

This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using

    r​? <reviewer name>
-->

Fixes #98871
2024-06-25 21:33:41 +02:00
mu001999
6997b6876d Detect unused structs which derived Default 2024-06-25 23:29:44 +08:00
Nicholas Nethercote
9828e960ab Remove __rust_force_expr.
This was added (with a different name) to improve an error message. It
is no longer needed -- removing it changes the error message, but overall
I think the new message is no worse:
- the mention of `#` in the first line is a little worse,
- but the extra context makes it very clear what the problem is, perhaps
  even clearer than the old message,
- and the removal of the note about the `expr` fragment (an internal
  detail of `__rust_force_expr`) is an improvement.

Overall I think the error is quite clear and still far better than the
old message that prompted #61933, which didn't even mention patterns.

The motivation for this is #124141, which will cause pasted
metavariables to be tokenized and reparsed instead of the AST node being
cached. This change in behaviour occasionally has a non-zero perf cost,
and `__rust_force_expr` causes the tokenize/reparse step to occur twice.
Removing `__rust_force_expr` greatly reduces the extra overhead for the
`deep-vector` benchmark.
2024-06-25 14:35:09 +10:00
Kevin Reid
13fca73f49 Replace MaybeUninit::uninit_array() with array repeat expression.
This is possible now that inline const blocks are stable; the idea was
even mentioned as an alternative when `uninit_array()` was added:
<https://github.com/rust-lang/rust/pull/65580#issuecomment-544200681>

> if it’s stabilized soon enough maybe it’s not worth having a
> standard library method that will be replaceable with
> `let buffer = [MaybeUninit::<T>::uninit(); $N];`

Const array repetition and inline const blocks are now stable (in the
next release), so that circumstance has come to pass, and we no longer
have reason to want `uninit_array()` other than convenience. Therefore,
let’s evaluate the inconvenience by not using `uninit_array()` in
the standard library, before potentially deleting it entirely.
2024-06-24 10:23:50 -07:00
bors
a0f01c3c10 Auto merge of #126838 - matthiaskrgr:rollup-qkop22o, r=matthiaskrgr
Rollup of 3 pull requests

Successful merges:

 - #126140 (Rename `std::fs::try_exists` to  `std::fs::exists` and stabilize fs_try_exists)
 - #126318 (Add a `x perf` command for integrating bootstrap with `rustc-perf`)
 - #126552 (Remove use of const traits (and `feature(effects)`) from stdlib)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-06-22 18:48:10 +00:00
bors
f944afe380 Auto merge of #116113 - kpreid:arcmut, r=dtolnay
Generalize `{Rc,Arc}::make_mut()` to unsized types.

* `{Rc,Arc}::make_mut()` now accept any type implementing the new unstable trait `core::clone::CloneToUninit`.
* `CloneToUninit` is implemented for `T: Clone` and for `[T] where T: Clone`.
* `CloneToUninit` is a generalization of the existing internal trait `alloc::alloc::WriteCloneIntoRaw`.
* New feature gate: `clone_to_uninit`

This allows performing `make_mut()` on `Rc<[T]>` and `Arc<[T]>`, which was not previously possible.

---

Previous PR description, now obsolete:

>  Add `{Rc, Arc}::make_mut_slice()`
>
> These functions behave identically to `make_mut()`, but operate on `Arc<[T]>` instead of `Arc<T>`.
>
> This allows performing the operation on slices, which was not previously possible because `make_mut()` requires `T: Clone` (and slices, being `!Sized`, do not and currently cannot implement `Clone`).
>
> Feature gate: `make_mut_slice`

try-job: test-various
2024-06-22 16:35:29 +00:00
Kevin Reid
88c3db57e4 Generalize {Rc,Arc}::make_mut() to unsized types.
This requires introducing a new internal type `RcUninit` (and
`ArcUninit`), which can own an `RcBox<T>` without requiring it to be
initialized, sized, or a slice. This is similar to `UniqueRc`, but
`UniqueRc` doesn't support the allocator parameter, and there is no
`UniqueArc`.
2024-06-22 08:08:00 -07:00
Kevin Reid
a9a4830d25 Replace WriteCloneIntoRaw with CloneToUninit. 2024-06-22 08:08:00 -07:00
Deadbeef
3b14b756d8 Remove feature(effects) from the standard library 2024-06-21 09:23:24 +00:00
bors
684b3553f7 Auto merge of #124032 - Voultapher:a-new-sort, r=thomcc
Replace sort implementations

This PR replaces the sort implementations with tailor-made ones that strike a balance of run-time, compile-time and binary-size, yielding run-time and compile-time improvements. Regressing binary-size for `slice::sort` while improving it for `slice::sort_unstable`. All while upholding the existing soft and hard safety guarantees, and even extending the soft guarantees, detecting strict weak ordering violations with a high chance and reporting it to users via a panic.

* `slice::sort` -> driftsort [design document](https://github.com/Voultapher/sort-research-rs/blob/main/writeup/driftsort_introduction/text.md), includes detailed benchmarks and analysis.

* `slice::sort_unstable` -> ipnsort [design document](https://github.com/Voultapher/sort-research-rs/blob/main/writeup/ipnsort_introduction/text.md), includes detailed benchmarks and analysis.

#### Why should we change the sort implementations?

In the [2023 Rust survey](https://blog.rust-lang.org/2024/02/19/2023-Rust-Annual-Survey-2023-results.html#challenges), one of the questions was: "In your opinion, how should work on the following aspects of Rust be prioritized?". The second place was "Runtime performance" and the third one "Compile Times". This PR aims to improve both.

#### Why is this one big PR and not multiple?

* The current documentation gives performance recommendations for `slice::sort` and `slice::sort_unstable`. If for example only one of them were to be changed, this advice would be misleading for some Rust versions. By replacing them atomically, the advice remains largely unchanged, and users don't have to change their code.
* driftsort and ipnsort share a substantial part of their implementations.
* The implementation of `select_nth_unstable` uses internals of `slice::sort_unstable`, which makes it impractical to split changes.

---

This PR is a collaboration with `@orlp.`
2024-06-20 20:40:43 +00:00
Lukas Bergdoll
a895ef77f0 Fix wrong big O star bracing in the doc comments 2024-06-20 18:07:04 +02:00
bors
1ca578e68e Auto merge of #126736 - matthiaskrgr:rollup-rb20oe3, r=matthiaskrgr
Rollup of 7 pull requests

Successful merges:

 - #126380 (Add std Xtensa targets support)
 - #126636 (Resolve Clippy `f16` and `f128` `unimplemented!`/`FIXME`s )
 - #126659 (More status-quo tests for the `#[coverage(..)]` attribute)
 - #126711 (Make Option::as_[mut_]slice const)
 - #126717 (Clean up some comments near `use` declarations)
 - #126719 (Fix assertion failure for some `Expect` diagnostics.)
 - #126730 (Add opaque type corner case test)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-06-20 13:36:42 +00:00
Trevor Gross
5745c220e6 Stabilize hint_assert_unchecked
Make both `hint_assert_unchecked` and `const_hint_assert_unchecked`
stable as `hint_assert_unchecked`.
2024-06-19 19:31:41 -04:00
Nicholas Nethercote
665821cb60 Add blank lines after module-level //! comments.
Most modules have such a blank line, but some don't. Inserting the blank
line makes it clearer that the `//!` comments are describing the entire
module, rather than the `use` declaration(s) that immediately follows.
2024-06-20 09:23:20 +10:00
Nicholas Nethercote
09006d6a88 Convert some module-level // and /// comments to //!.
This makes their intent and expected location clearer. We see some
examples where these comments were not clearly separate from `use`
declarations, which made it hard to understand what the comment is
describing.
2024-06-20 09:23:18 +10:00
Gary Guo
ebdfcd93a3 Stabilise c_unwind 2024-06-19 13:54:51 +01:00
Lukas Bergdoll
732616998c Revert panic_safe test changes
The changes made only a limited improvement for the current small
miri coverage and in general test coverage of the sort implementations.
But they exploded test times from ~13s to ~240s, which is not deemed
worth it.
2024-06-17 22:05:35 +02:00
lukaslueg
893f95f1f7
Update Arc::try_unwrap() docs
Clarify the language wrt `race condition` not meaning `memory unsafety`.
2024-06-16 09:07:08 +02:00
Matthias Krüger
b55cabe636
Rollup merge of #126285 - kpreid:unique-rc, r=dtolnay
`UniqueRc`: support allocators and `T: ?Sized`.

Added the following (all unstable):

* Defaulted type pararameter `A: Allocator`.
* `UniqueRc::new_in()`.
* `T:  ?Sized` where possible.
* `impl CoerceUnsized for UniqueRc`.

These changes are motivated by supporting the implementation of unsized `Rc::make_mut()` (PR #116113), but are also intended to be obvious generalizations of `UniqueRc` to support the things `Rc` does.

r? ``````@the8472``````
2024-06-14 12:23:37 +02:00
Kevin Reid
27ecb71635 UniqueRc: support allocators and T: ?Sized.
Added the following (all unstable):

* Defaulted type pararameter `A: Allocator`.
* `UniqueRc::new_in()`.
* `T:  ?Sized` where possible.
* `impl CoerceUnsized for UniqueRc`.
* Drive-by doc polish: links and periods at the end of sentences.

These changes are motivated by supporting the implementation of unsized
`Rc::make_mut()` (PR #116113), but are also intended to be obvious
generalizations of `UniqueRc` to support the things `Rc` does.
2024-06-11 15:16:47 -07:00
Pietro Albini
cd2ed56502
remove cfg(bootstrap) 2024-06-11 16:52:04 +02:00
Pietro Albini
be9e27e490
replace version placeholder 2024-06-11 16:52:02 +02:00
León Orell Valerian Liehr
cbda797b77
Rollup merge of #125951 - slanterns:error_in_core_stabilization, r=Amanieu
Stabilize `error_in_core`

Closes: https://github.com/rust-lang/rust/issues/103765.

`@rustbot` label: +T-libs-api

r? libs-api
2024-06-08 04:25:44 +02:00
Matthias Krüger
0acb5b8513
Rollup merge of #124012 - slanterns:as_slice_stabilize, r=BurntSushi
Stabilize `binary_heap_as_slice`

This PR stabilizes `binary_heap_as_slice`:

```rust
// std::collections::BinaryHeap

impl BinaryHeap<T> {
    pub fn as_slice(&self) -> &[T]
}
```

<br>

Tracking issue: https://github.com/rust-lang/rust/issues/83659.
Implementation PR: https://github.com/rust-lang/rust/pull/82331.

FCPs already completed in the tracking issue.

Closes https://github.com/rust-lang/rust/issues/83659.

r? libs-api
2024-06-07 20:14:27 +02:00
Slanterns
76065f5b27
Stabilize error_in_core 2024-06-07 08:30:00 +08:00
Jubilee
448159c8e6
Rollup merge of #125982 - xTachyon:fix-linked-list, r=jhpratt
Make deleting on LinkedList aware of the allocator

Fixed #125950
2024-06-05 01:14:33 -07:00
Jubilee
9ccc7b78ec
Rollup merge of #123168 - joshtriplett:size-of-prelude, r=Amanieu
Add `size_of` and `size_of_val` and `align_of` and `align_of_val` to the prelude

(Note: need to update the PR to add `align_of` and `align_of_val`, and remove the second commit with the myriad changes to appease the lint.)

Many, many projects use `size_of` to get the size of a type. However,
it's also often equally easy to hardcode a size (e.g. `8` instead of
`size_of::<u64>()`). Minimizing friction in the use of `size_of` helps
ensure that people use it and make code more self-documenting.

The name `size_of` is unambiguous: the name alone, without any prefix or
path, is self-explanatory and unmistakeable for any other functionality.
Adding it to the prelude cannot produce any name conflicts, as any local
definition will silently shadow the one from the prelude. Thus, we don't
need to wait for a new edition prelude to add it.
2024-06-05 01:14:29 -07:00
Andrei Damian
2bad3d1392 Make deleting on LinkedList aware of the allocator 2024-06-04 18:49:13 +03:00
Lukas Wirth
d392c50ca3 Ignore vec_deque_alloc_error::test_shrink_to_unwind test on non-unwind targets 2024-06-03 13:46:56 +02:00
coekjan
7cee7c664b
stablize const_binary_heap_constructor & create an unstable feature const_binary_heap_new_in for BinaryHeap::new_in 2024-06-01 12:44:19 +08:00
Matthias Krüger
f775fffac5
Rollup merge of #125561 - Cyborus04:stabilize-slice-flatten, r=scottmcm
Stabilize `slice_flatten`
2024-05-26 13:43:07 +02:00
Cyborus
824ffd29ee
Stabilize slice_flatten 2024-05-26 01:26:24 -04:00
Matthias Krüger
7fb81229d3
Rollup merge of #123803 - Sp00ph:shrink_to_fix, r=Mark-Simulacrum
Fix `VecDeque::shrink_to` UB when `handle_alloc_error` unwinds.

Fixes #123369

For `VecDeque` it's relatively simple to restore the buffer into a consistent state so this PR does just that.

Note that with its current implementation, `shrink_to` may change the internal arrangement of elements in the buffer, so e.g. `[D, <uninit>, A, B, C]` will become `[<uninit>, A, B, C, D]` and `[<uninit>, <uninit>, A, B, C]` may become `[B, C, <uninit>, <uninit>, A]` if `shrink_to` unwinds. This shouldn't be an issue though as we don't make any guarantees about the stability of the internal buffer arrangement (and this case is impossible to hit on stable anyways).

This PR also includes a test with code adapted from #123369 which fails without the new `shrink_to` code. Does this suffice or do we maybe need more exhaustive tests like in #108475?

cc `@Amanieu`

`@rustbot` label +T-libs
2024-05-25 22:15:17 +02:00
Urgau
02eada8f8d Remove now outdated comment since we bumped stage0 2024-05-24 08:08:41 +02:00
Lzu Tao
c7d2f4592f addresss reviews 2024-05-21 18:17:55 +00:00
Lzu Tao
0734ae22f5 Update check-cfg lists for alloc 2024-05-21 18:17:55 +00:00
bors
6715446db6 Auto merge of #125358 - matthiaskrgr:rollup-mx841tg, r=matthiaskrgr
Rollup of 7 pull requests

Successful merges:

 - #124570 (Miscellaneous cleanups)
 - #124772 (Refactor documentation for Apple targets)
 - #125011 (Add opt-for-size core lib feature flag)
 - #125218 (Migrate `run-make/no-intermediate-extras` to new `rmake.rs`)
 - #125225 (Use functions from `crt_externs.h` on iOS/tvOS/watchOS/visionOS)
 - #125266 (compiler: add simd_ctpop intrinsic)
 - #125348 (Small fixes to `std::path::absolute` docs)

Failed merges:

 - #125296 (Fix `unexpected_cfgs` lint on std)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-05-21 12:50:09 +00:00
Matthias Krüger
4abf179b14
Rollup merge of #125011 - diondokter:opt-for-size, r=Amanieu,kobzol
Add opt-for-size core lib feature flag

Adds a feature flag to the core library that enables the possibility to have smaller implementations for certain algorithms.

So far, the core lib has traded performance for binary size. This is likely what most people want since they have big simd-capable machines. However, people on small machines, like embedded devices, don't enjoy the potential speedup of the bigger algorithms, but do have to pay for them. These microcontrollers often only have 16-1024kB of flash memory.

This PR is the result of some talks with project members like `@Amanieu` at RustNL.
There are some open questions of how this is eventually stabilized, but it's a similar question as with the existing `panic_immediate_abort` feature.

Speaking as someone from the embedded side, we'd rather have this unstable for a while as opposed to not having it at all. In the meantime we can try to use it and also add additional PRs to the core lib that uses the feature flag in areas where we find benefit.

Open questions from my side:
- Is this a good feature name?
  - `panic_immediate_abort` is fairly verbose, so I went with something equally verbose
  - It's easy to refactor later
- I've added the feature to `std` and `alloc` as well as they might benefit too. Do we agree?
  - I expect these to get less usage out of the flag since most size-constraint projects don't use these libraries often.
2024-05-21 12:47:04 +02:00
Michael Goulet
1a81092531 Add the impls for Box<[T]>: IntoIterator
Co-authored-by: ltdk <usr@ltdk.xyz>
2024-05-20 19:21:30 -04:00
Matthias Krüger
d1da2387a4
Rollup merge of #125283 - zachs18:arc-default-shared, r=dtolnay
Use a single static for all default slice Arcs.

Also adds debug_asserts in Drop for Weak/Arc that the shared static is not being "dropped"/"deallocated".

As per https://github.com/rust-lang/rust/pull/124640#pullrequestreview-2064962003

r? dtolnay
2024-05-20 14:26:53 +02:00
Matthias Krüger
7389416284
Rollup merge of #125093 - zachs18:rc-into-raw-with-allocator-only, r=Mark-Simulacrum
Add `fn into_raw_with_allocator` to Rc/Arc/Weak.

Split out from #119761

Add `fn into_raw_with_allocator` for `Rc`/`rc::Weak`[^1]/`Arc`/`sync::Weak`.
* Pairs with `from_raw_in` (which already exists on all 4 types).
* Name matches `Box::into_raw_with_allocator`.
* Associated fns on `Rc`/`Arc`, methods on `Weak`s.

<details> <summary>Future PR/ACP</summary>

As a follow-on to this PR, I plan to make a PR/ACP later to move `into_raw(_parts)` from `Container<_, A: Allocator>` to only `Container<_, Global>` (where `Container` = `Vec`/`Box`/`Rc`/`rc::Weak`/`Arc`/`sync::Weak`) so that users of non-`Global` allocators have to explicitly handle the allocator when using `into_raw`-like APIs.

The current behaviors of stdlib containers are inconsistent with respect to what happens to the allocator when `into_raw` is called (which does not return the allocator)

| Type | `into_raw` currently callable with | behavior of `into_raw`|
| --- | --- | --- |
| `Box` | any allocator | allocator is [dropped](https://doc.rust-lang.org/nightly/src/alloc/boxed.rs.html#1060) |
| `Vec` | any allocator | allocator is [forgotten](https://doc.rust-lang.org/nightly/src/alloc/vec/mod.rs.html#884) |
| `Arc`/`Rc`/`Weak` | any allocator | allocator is [forgotten](https://doc.rust-lang.org/src/alloc/sync.rs.html#1487)(Arc) [(sync::Weak)](https://doc.rust-lang.org/src/alloc/sync.rs.html#2726) [(Rc)](https://doc.rust-lang.org/src/alloc/rc.rs.html#1352) [(rc::Weak)](https://doc.rust-lang.org/src/alloc/rc.rs.html#2993) |

In my opinion, neither implicitly dropping nor implicitly forgetting the allocator is ideal; dropping it could immediately invalidate the returned pointer, and forgetting it could unintentionally leak memory. My (to-be) proposed solution is to just forbid calling `into_raw(_parts)` on containers with non-`Global` allocators, and require calling `into_raw_with_allocator`(/`Vec::into_raw_parts_with_alloc`)

</details>

[^1]:  Technically, `rc::Weak::into_raw_with_allocator` is not newly added, as it was modified and renamed from `rc::Weak::into_raw_and_alloc`.
2024-05-20 08:31:41 +02:00
bors
12075f04e6 Auto merge of #123878 - jwong101:inplacecollect, r=jhpratt
optimize inplace collection of Vec

This PR has the following changes:

1. Using `usize::unchecked_mul` in 79424056b0/library/alloc/src/vec/in_place_collect.rs (L262) as LLVM, does not know that the operation can't wrap, since that's the size of the original allocation.

Given the following:

```rust

pub struct Foo([usize; 3]);

pub fn unwrap_copy(v: Vec<Foo>) -> Vec<[usize; 3]> {
    v.into_iter().map(|f| f.0).collect()
}
```

<details>
<summary>Before this commit:</summary>

```llvm
define void `@unwrap_copy(ptr` noalias nocapture noundef writeonly sret([24 x i8]) align 8 dereferenceable(24) %_0, ptr noalias nocapture noundef readonly align 8 dereferenceable(24) %iter) {
start:
  %me.sroa.0.0.copyload.i = load i64, ptr %iter, align 8
  %me.sroa.4.0.self.sroa_idx.i = getelementptr inbounds i8, ptr %iter, i64 8
  %me.sroa.4.0.copyload.i = load ptr, ptr %me.sroa.4.0.self.sroa_idx.i, align 8
  %me.sroa.5.0.self.sroa_idx.i = getelementptr inbounds i8, ptr %iter, i64 16
  %me.sroa.5.0.copyload.i = load i64, ptr %me.sroa.5.0.self.sroa_idx.i, align 8
  %_19.i.idx = mul nsw i64 %me.sroa.5.0.copyload.i, 24
  %0 = udiv i64 %_19.i.idx, 24

; Unnecessary calculation
  %_16.i.i = mul i64 %me.sroa.0.0.copyload.i, 24
  %dst_cap.i.i = udiv i64 %_16.i.i, 24

  store i64 %dst_cap.i.i, ptr %_0, align 8
  %1 = getelementptr inbounds i8, ptr %_0, i64 8
  store ptr %me.sroa.4.0.copyload.i, ptr %1, align 8
  %2 = getelementptr inbounds i8, ptr %_0, i64 16
  store i64 %0, ptr %2, align 8
  ret void
}
```
</details>

<details>
<summary>After:</summary>

```llvm
define void `@unwrap_copy(ptr` noalias nocapture noundef writeonly sret([24 x i8]) align 8 dereferenceable(24) %_0, ptr noalias nocapture noundef readonly align 8 dereferenceable(24) %iter) {
start:
  %me.sroa.0.0.copyload.i = load i64, ptr %iter, align 8
  %me.sroa.4.0.self.sroa_idx.i = getelementptr inbounds i8, ptr %iter, i64 8
  %me.sroa.4.0.copyload.i = load ptr, ptr %me.sroa.4.0.self.sroa_idx.i, align 8
  %me.sroa.5.0.self.sroa_idx.i = getelementptr inbounds i8, ptr %iter, i64 16
  %me.sroa.5.0.copyload.i = load i64, ptr %me.sroa.5.0.self.sroa_idx.i, align 8
  %_19.i.idx = mul nsw i64 %me.sroa.5.0.copyload.i, 24
  %0 = udiv i64 %_19.i.idx, 24
  store i64 %me.sroa.0.0.copyload.i, ptr %_0, align 8
  %1 = getelementptr inbounds i8, ptr %_0, i64 8
  store ptr %me.sroa.4.0.copyload.i, ptr %1, align 8
  %2 = getelementptr inbounds i8, ptr %_0, i64 16
  store i64 %0, ptr %2, align 8, !alias.scope !9, !noalias !14
  ret void
}
```
</details>

Note that there is still one more `mul,udiv` pair that I couldn't get
rid of. The root cause is the same issue as https://github.com/rust-lang/rust/issues/121239, the `nuw` gets
stripped off of `ptr::sub_ptr`.

2.

`Iterator::try_fold` gets called on the underlying Iterator in
`SpecInPlaceCollect::collect_in_place` whenever it does not implement
`TrustedRandomAccess`. For types that impl `Drop`, LLVM currently can't
tell that the drop can never occur, when using the default
`Iterator::try_fold` implementation.

For example, given the following code from #120493

```rust
#[repr(transparent)]
struct WrappedClone {
    inner: String
}

#[no_mangle]
pub fn unwrap_clone(list: Vec<WrappedClone>) -> Vec<String> {
    list.into_iter().map(|s| s.inner).collect()
}
```

<details>
<summary>The asm for the `unwrap_clone` method is currently:</summary>

```asm
unwrap_clone:
        push    rbp
        push    r15
        push    r14
        push    r13
        push    r12
        push    rbx
        push    rax
        mov     rbx, rdi
        mov     r12, qword ptr [rsi]
        mov     rdi, qword ptr [rsi + 8]
        mov     rax, qword ptr [rsi + 16]
        movabs  rsi, -6148914691236517205
        mov     r14, r12
        test    rax, rax
        je      .LBB0_10
        lea     rcx, [rax + 2*rax]
        lea     r14, [r12 + 8*rcx]
        shl     rax, 3
        lea     rax, [rax + 2*rax]
        xor     ecx, ecx
.LBB0_2:
        cmp     qword ptr [r12 + rcx], 0
        je      .LBB0_4
        add     rcx, 24
        cmp     rax, rcx
        jne     .LBB0_2
        jmp     .LBB0_10
.LBB0_4:
        lea     rdx, [rax - 24]
        lea     r14, [r12 + rcx]
        cmp     rdx, rcx
        je      .LBB0_10
        mov     qword ptr [rsp], rdi
        sub     rax, rcx
        add     rax, -24
        mul     rsi
        mov     r15, rdx
        lea     rbp, [r12 + rcx]
        add     rbp, 32
        shr     r15, 4
        mov     r13, qword ptr [rip + __rust_dealloc@GOTPCREL]
        jmp     .LBB0_6
.LBB0_8:
        add     rbp, 24
        dec     r15
        je      .LBB0_9
.LBB0_6:
        mov     rsi, qword ptr [rbp]
        test    rsi, rsi
        je      .LBB0_8
        mov     rdi, qword ptr [rbp - 8]
        mov     edx, 1
        call    r13
        jmp     .LBB0_8
.LBB0_9:
        mov     rdi, qword ptr [rsp]
        movabs  rsi, -6148914691236517205
.LBB0_10:
        sub     r14, r12
        mov     rax, r14
        mul     rsi
        shr     rdx, 4
        mov     qword ptr [rbx], r12
        mov     qword ptr [rbx + 8], rdi
        mov     qword ptr [rbx + 16], rdx
        mov     rax, rbx
        add     rsp, 8
        pop     rbx
        pop     r12
        pop     r13
        pop     r14
        pop     r15
        pop     rbp
        ret
```
</details>

<details>
<summary>After this PR:</summary>

```asm
unwrap_clone:
	mov	rax, rdi
	movups	xmm0, xmmword ptr [rsi]
	mov	rcx, qword ptr [rsi + 16]
	movups	xmmword ptr [rdi], xmm0
	mov	qword ptr [rdi + 16], rcx
	ret
```
</details>

Fixes https://github.com/rust-lang/rust/issues/120493
2024-05-20 00:51:12 +00:00