Implement iterator specialization traits on more adapters
This adds
* `TrustedLen` to `Skip` and `StepBy`
* `TrustedRandomAccess` to `Skip`
* `InPlaceIterable` and `SourceIter` to `Copied` and `Cloned`
The first two might improve performance in the compiler itself, since `skip` is used in several places. Situations that would exercise the last point are probably rare, since they would require an owning iterator that yields references as items somewhere in its pipeline.
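For example (a hedged illustration, since whether a given pipeline actually collects in place is an internal implementation detail), this is the shape of code the `Copied`/`Cloned` change targets: an owning `vec::IntoIter` whose items are references, mapped back to values with `copied`:

```rust
// On a 64-bit target `&u64` and `u64` have the same size and alignment, so
// with `SourceIter`/`InPlaceIterable` on `Copied` this pipeline becomes a
// candidate for reusing the source allocation instead of allocating a new one.
fn collect_copied(refs: Vec<&u64>) -> Vec<u64> {
    refs.into_iter().copied().collect()
}
```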
Improvements for `Skip`:
```
# old
test iter::bench_skip_trusted_random_access ... bench: 8,335 ns/iter (+/- 90)
# new
test iter::bench_skip_trusted_random_access ... bench: 2,753 ns/iter (+/- 27)
```
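As a rough illustration (hypothetical code, not the benchmark itself): `TrustedRandomAccess` mostly pays off through `Zip`'s specialization, so pipelines that zip through a `skip` are the kind of code that can benefit:

```rust
// Both sides of the `zip` are random-access iterators; with
// `TrustedRandomAccess` on `Skip`, the zip specialization can index into the
// underlying slice directly instead of advancing two iterators in lockstep.
fn sum_offset(a: &[u32], b: &[u32]) -> u32 {
    a.iter().skip(100).zip(b.iter()).map(|(x, y)| x + y).sum()
}
```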
Remove alignment-changing in-place collect
This removes the alignment-changing in-place collect optimization introduced in #110353
Currently, stable users can't benefit from the optimization because `GlobalAlloc` doesn't support alignment-changing `realloc`, and neither do most POSIX allocators. So in practice it has a negative impact on performance.
Explanation from https://github.com/rust-lang/rust/issues/120091#issuecomment-1899071681:
> > You mention that in case of alignment mismatch -- when the new alignment is less than the old -- the implementation calls `mremap`.
>
> I was trying to note that this isn't really the case in practice, due to the semantics of Rust's allocator APIs. The only use of the allocator within the `in_place_collect` implementation itself is [a call to `Allocator::shrink()`](db7125f008/library/alloc/src/vec/in_place_collect.rs (L299-L303)), which per its documentation [allows decreasing the required alignment](https://doc.rust-lang.org/1.75.0/core/alloc/trait.Allocator.html). However, in stable Rust, the only available `Allocator` is [`Global`](https://doc.rust-lang.org/1.75.0/alloc/alloc/struct.Global.html), which delegates to the registered `GlobalAlloc`. Since `GlobalAlloc::realloc()` [cannot change the required alignment](https://doc.rust-lang.org/1.75.0/core/alloc/trait.GlobalAlloc.html#method.realloc), the implementation of [`<Global as Allocator>::shrink()`](db7125f008/library/alloc/src/alloc.rs (L280-L321)) must fall back to creating a brand-new allocation, `memcpy`ing the data into it, and freeing the old allocation, whenever the alignment doesn't remain exactly the same.
>
> Therefore, the underlying allocator, provided by libc or some other source, has no opportunity to internally `mremap()` the data when the alignment is changed, since it has no way of knowing that the allocation is the same.
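For reference, this is the kind of pipeline the removed optimization was aimed at (a hedged example; whether a particular pipeline is collected in place is an implementation detail): the source and destination element types have different alignments, so reusing the source allocation would require an alignment-changing shrink.

```rust
// Source items are 8-byte-aligned u64s, destination items are 4-byte-aligned
// u32s. Reusing the Vec<u64> allocation for the Vec<u32> would need a shrink
// that lowers the alignment, which `GlobalAlloc::realloc` cannot express, so
// on stable this ends up as a fresh allocation plus a memcpy anyway.
let src: Vec<u64> = (0..1024u64).collect();
let dst: Vec<u32> = src.into_iter().map(|x| x as u32).collect();
assert_eq!(dst.len(), 1024);
```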
Update documentation for Vec::into_boxed_slice to be more clear about excess capacity
Currently, the documentation for `Vec::into_boxed_slice` says that "if the vector has excess capacity, its items will be moved into a newly-allocated buffer with exactly the right capacity." This is misleading: a copy does not necessarily occur, depending on whether the allocator supports in-place shrinking. I copied some of the wording from `shrink_to_fit`, though it could potentially still be worded better than this.
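For example (adapted to the behaviour the documentation describes), the excess capacity is gone after the conversion regardless of how the allocator shrinks:

```rust
let mut vec = Vec::with_capacity(10);
vec.extend([1, 2, 3]);
assert!(vec.capacity() >= 10);

let slice = vec.into_boxed_slice();
// Converting back shows an exact-fit allocation: the excess capacity is gone,
// whether the allocator shrank in place or moved the data to a new buffer.
assert_eq!(slice.into_vec().capacity(), 3);
```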
merge core_panic feature into panic_internals
I don't know why those are two separate features, but it does not seem intentional. This merge is useful because with https://github.com/rust-lang/rust/pull/118123, panic_internals is recognized as an internal feature, but core_panic is not -- but core_panic definitely should be internal.
rc: Take *const T in is_dangling
It is not important which one is used since `is_dangling` does not access memory, but `*const` removes the need for `*const T` -> `*mut T` casts in `from_raw_in`.
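A minimal sketch of the shape of the function after the change, assuming the dangling sentinel is the all-ones address used by `Weak::new` (the real code in `alloc::rc` may differ in details):

```rust
// The check is a pure address comparison and never dereferences `ptr`, so it
// does not matter for soundness whether it takes `*const T` or `*mut T`;
// taking `*const T` simply spares callers such as `from_raw_in` a cast.
fn is_dangling<T: ?Sized>(ptr: *const T) -> bool {
    ptr.cast::<()>() as usize == usize::MAX
}
```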
Clean up alloc::sync::Weak Clone implementation
Since both return points (the tail and the early return) return the same expression and the only difference is whether `inner` is available, the code that does the atomic operations and checks on `inner` was moved into the `if` body, and the only return is now at the tail. The original comments are preserved.
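A self-contained analog of the refactor (not the real `alloc::sync` code, which additionally guards against reference-count overflow): the early return that duplicated the tail expression is gone, and only the bookkeeping that needs `inner` sits inside the `if`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

// `None` models a dangling Weak as created by `Weak::new()`.
struct DemoWeak {
    inner: Option<&'static AtomicUsize>,
}

impl Clone for DemoWeak {
    fn clone(&self) -> Self {
        if let Some(weak_count) = self.inner {
            // Only a live Weak has a weak count to bump.
            weak_count.fetch_add(1, Relaxed);
        }
        // Single return point, shared by the dangling and the live case.
        DemoWeak { inner: self.inner }
    }
}
```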
fix minor mistake in comments describing VecDeque resizing
This avoids the confusion where one of the items in the deque seemed to disappear in two of the three illustrated cases.
add more niches to rawvec
Previously `RawVec` only had a single niche, in its `NonNull` pointer. With this change it now has `isize::MAX` niches, since half the value space of the capacity field is never needed: a capacity can never exceed `isize::MAX`.
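As background, here is the general niche mechanism illustrated with a stable type (this is not the `RawVec` internals, just the layout principle the change exploits): a field whose type rules out certain bit patterns hands those patterns to the enum layout algorithm as free tag values.

```rust
use std::mem::size_of;
use std::num::NonZeroUsize;

// NonZeroUsize promises the value 0 never occurs, so `Option` can represent
// `None` as that forbidden value and needs no separate discriminant. The
// capacity field of RawVec now makes an analogous promise for every value
// above isize::MAX.
fn main() {
    assert_eq!(size_of::<Option<NonZeroUsize>>(), size_of::<usize>());
    assert_eq!(size_of::<Option<Vec<u8>>>(), size_of::<Vec<u8>>());
}
```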
Add lint against ambiguous wide pointer comparisons
This PR is the resolution of https://github.com/rust-lang/rust/issues/106447 decided in https://github.com/rust-lang/rust/issues/117717 by T-lang.
## `ambiguous_wide_pointer_comparisons`
*warn-by-default*
The `ambiguous_wide_pointer_comparisons` lint checks comparisons whose operands are wide pointers (`*const`/`*mut` pointers to `?Sized` types).
### Example
```rust
trait T {}
struct A;
struct B;
impl T for A {}
impl T for B {}

let ab = (A, B);
let a = &ab.0 as *const dyn T;
let b = &ab.1 as *const dyn T;
let _ = a == b;
```
### Explanation
The comparison includes the pointer metadata (for trait objects, the vtable pointer), which may not be expected: two pointers to the same data can compare unequal simply because they carry different vtable copies.
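When only the addresses matter, the unambiguous spelling is `std::ptr::addr_eq`, which compares the data pointers and ignores the metadata:

```rust
use std::fmt::Debug;

fn same_address(a: *const dyn Debug, b: *const dyn Debug) -> bool {
    // Compares only the addresses; the vtable half of each wide pointer is
    // deliberately ignored, so the intent is explicit and the lint stays quiet.
    std::ptr::addr_eq(a, b)
}
```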
-------
This PR also drops `clippy::vtable_address_comparisons` which is superseded by this one.
~~One thing: is the current naming right? `invalid` seems a bit too much.~~
Fixes https://github.com/rust-lang/rust/issues/117717
remove redundant imports
Detects redundant imports that can be eliminated.
For #117772: in order to facilitate review and modification, the checking code and the code that removes redundant imports were split into two PRs.
r? `@petrochenkov`
Stabilize arc_unwrap_or_clone
Fixes: #93610
This likely needs an FCP. I created this PR because its stabilization is trivial, and the FCP can just be conducted here. I'm not sure how to ping the libs API team (the last attempt apparently didn't work, according to the GitHub UI).
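For context, the function being stabilized returns the inner value, unwrapping it when this `Arc` is the only strong reference and cloning it otherwise; a small usage example:

```rust
use std::sync::Arc;

let a = Arc::new(String::from("hello"));
let b = Arc::clone(&a);

// Two strong references exist here, so the inner String is cloned.
let s1 = Arc::unwrap_or_clone(a);
// `b` is now the only strong reference, so the String is moved out, no clone.
let s2 = Arc::unwrap_or_clone(b);
assert_eq!(s1, s2);
```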
Split `Vec::dedup_by` into 2 loops
The first loop runs until two equal elements are found; the second loop runs afterwards, and only if the first one found anything. This makes it possible to avoid any memory writes until an item that needs to be removed is actually found.
This leads to significant performance gains if all `Vec` items are kept: -40% on my benchmark with unique integers.
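A simplified sketch of the idea on a plain `Vec<u32>` with `==` (the real `dedup_by` works with raw pointers, a caller-supplied predicate, and a drop guard for panic safety):

```rust
fn dedup(v: &mut Vec<u32>) {
    let len = v.len();
    if len <= 1 {
        return;
    }
    // First loop: only reads, looking for the first pair of equal neighbours.
    let mut first_dup = 1;
    while first_dup < len && v[first_dup] != v[first_dup - 1] {
        first_dup += 1;
    }
    if first_dup == len {
        return; // nothing to remove, and not a single write was issued
    }
    // Second loop: the usual compaction, starting at the first duplicate.
    let mut write = first_dup;
    for read in first_dup + 1..len {
        if v[read] != v[write - 1] {
            v[write] = v[read];
            write += 1;
        }
    }
    v.truncate(write);
}
```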
Results of benchmarks before the implementation (including the new benchmarks where nothing needs to be removed):
* vec::bench_dedup_all_100 74.00ns/iter +/- 13.00ns
* vec::bench_dedup_all_1000 572.00ns/iter +/- 272.00ns
* vec::bench_dedup_all_100000 64.42µs/iter +/- 19.47µs
* __vec::bench_dedup_none_100 67.00ns/iter +/- 17.00ns__
* __vec::bench_dedup_none_1000 662.00ns/iter +/- 86.00ns__
* __vec::bench_dedup_none_10000 9.16µs/iter +/- 2.71µs__
* __vec::bench_dedup_none_100000 91.25µs/iter +/- 1.82µs__
* vec::bench_dedup_random_100 105.00ns/iter +/- 11.00ns
* vec::bench_dedup_random_1000 781.00ns/iter +/- 10.00ns
* vec::bench_dedup_random_10000 9.00µs/iter +/- 5.62µs
* vec::bench_dedup_random_100000 449.81µs/iter +/- 74.99µs
* vec::bench_dedup_slice_truncate_100 105.00ns/iter +/- 16.00ns
* vec::bench_dedup_slice_truncate_1000 2.65µs/iter +/- 481.00ns
* vec::bench_dedup_slice_truncate_10000 18.33µs/iter +/- 5.23µs
* vec::bench_dedup_slice_truncate_100000 501.12µs/iter +/- 46.97µs
Results after implementation:
* vec::bench_dedup_all_100 75.00ns/iter +/- 9.00ns
* vec::bench_dedup_all_1000 494.00ns/iter +/- 117.00ns
* vec::bench_dedup_all_100000 58.13µs/iter +/- 8.78µs
* __vec::bench_dedup_none_100 52.00ns/iter +/- 22.00ns__
* __vec::bench_dedup_none_1000 417.00ns/iter +/- 116.00ns__
* __vec::bench_dedup_none_10000 4.11µs/iter +/- 546.00ns__
* __vec::bench_dedup_none_100000 40.47µs/iter +/- 5.36µs__
* vec::bench_dedup_random_100 77.00ns/iter +/- 15.00ns
* vec::bench_dedup_random_1000 681.00ns/iter +/- 86.00ns
* vec::bench_dedup_random_10000 11.66µs/iter +/- 2.22µs
* vec::bench_dedup_random_100000 469.35µs/iter +/- 20.53µs
* vec::bench_dedup_slice_truncate_100 100.00ns/iter +/- 5.00ns
* vec::bench_dedup_slice_truncate_1000 2.55µs/iter +/- 224.00ns
* vec::bench_dedup_slice_truncate_10000 18.95µs/iter +/- 2.59µs
* vec::bench_dedup_slice_truncate_100000 492.85µs/iter +/- 72.84µs
Resolves #77772
P.S. Note that this is the same PR as #92104; I just missed the review back then and forgot about it. Also, I cannot reopen that pull request, so I am creating a new one. I responded to the remaining questions directly by adding comments to my code.
Add support for making lib features internal
We have the notion of an "internal" lang feature: a feature that is never intended to be stabilized, and whose use can cause ICEs and other issues without that being considered a bug.
This extends that idea to lib features as well. It is an alternative to https://github.com/rust-lang/rust/pull/115623: instead of using an attribute to declare lib features internal, we simply do this based on the name. Everything ending in `_internals` or `_internal` is considered internal.
Then we rename `core_intrinsics` to `core_intrinsics_internal`, which fixes https://github.com/rust-lang/rust/issues/115597.
Expand in-place iteration specialization to Flatten, FlatMap and ArrayChunks
This enables the following cases to collect in-place:
```rust
use std::num::NonZeroUsize;

let v = vec![[0u8; 4]; 1024];
let v: Vec<_> = v.into_iter().flatten().collect();

let v: Vec<Option<NonZeroUsize>> = vec![NonZeroUsize::new(0); 1024];
let v: Vec<_> = v.into_iter().flatten().collect();

let v = vec![0u8; 4096];
// `array_chunks` is unstable and requires #![feature(iter_array_chunks)].
let v: Vec<_> = v.into_iter().array_chunks::<4>().collect();
```
The niche-filled `Option` flattening in particular should be useful in real code.