The return of the GroupBy and GroupByMut iterators on slice
According to https://github.com/rust-lang/rfcs/pull/2477#issuecomment-742034372, I am opening this PR again, this time I implemented it in safe Rust only, it is therefore much easier to read and is completely safe.
This PR proposes to add two new methods to the slice, the `group_by` and `group_by_mut`. These two methods provide a way to iterate over non-overlapping sub-slices of a base slice that are separated by the predicate given by the user (e.g. `Partial::eq`, `|a, b| a.abs() < b.abs()`).
```rust
let slice = &[1, 1, 1, 3, 3, 2, 2, 2];
let mut iter = slice.group_by(|a, b| a == b);
assert_eq!(iter.next(), Some(&[1, 1, 1][..]));
assert_eq!(iter.next(), Some(&[3, 3][..]));
assert_eq!(iter.next(), Some(&[2, 2, 2][..]));
assert_eq!(iter.next(), None);
```
[An RFC](https://github.com/rust-lang/rfcs/pull/2477) was open 2 years ago but wasn't necessary.
Rollup of 9 pull requests
Successful merges:
- #78934 (refactor: removing library/alloc/src/vec/mod.rs ignore-tidy-filelength)
- #79479 (Add `Iterator::intersperse`)
- #80128 (Edit rustc_ast::ast::FieldPat docs)
- #80424 (Don't give an error when creating a file for the first time)
- #80458 (Some Promotion Refactoring)
- #80488 (Do not create dangling &T in Weak<T>::drop)
- #80491 (Miri: make size/align_of_val work for dangling raw ptrs)
- #80495 (Rename kw::Invalid -> kw::Empty)
- #80513 (Add regression test for #80062)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Do not create dangling &T in Weak<T>::drop
Since at this point all strong pointers have been dropped, the wrapped `T` has also been dropped. As such, creating a `&T` to the dropped place is negligent at best (language UB at worst). Since we have `Layout::for_value_raw` now, use that instead of `Layout::for_value` to avoid creating the `&T`.
This does have implications for custom (potentially thin) DSTs, though much less severe than those discussed in #80407. Specifically, one of two things has to be true:
- It has to be possible to use a `*const T` to a dropped (potentially custom, potentially thin) unsized tailed object to determine the layout (size/align) of the object. This is what is currently implemented (though with `&T` instead of `&T`). The validity of reading some location after it has been dropped is an open question IIUC (https://github.com/rust-lang/unsafe-code-guidelines/issues/188) (except when the whole type is `Copy`, per `drop_in_place`'s docs).
In this design, custom DSTs would get a `*mut T` and use that to return layout, and must be able to do so while in the "zombie" (post-drop, pre-free) state.
- `RcBox`/`ArcInner` compute and store layout eagerly, so that they don't have to ask the type for its layout after dropping it.
Importantly, this is already true today, as you can construct `Rc<DST>`, create a `Weak<DST>`, and drop the `Rc` before the `Weak`. This PR is a strict improvement over the status quo, and the above question about potentially thin DSTs will need to be resolved by any custom DST proposal.
Add "length" as doc alias to len methods
Currently when searching for `length` there are no results: https://doc.rust-lang.org/std/?search=length. This makes `len` methods appear when searching for `length`.
BTreeMap: clean up access to MaybeUninit arrays
Stop exposing and using immutable access to `MaybeUninit` slices when we need and have exclusive access to the tree.
r? `@Mark-Simulacrum`
BTreeMap: relax the explicit borrow rule to make code shorter and safer
Expressions like `.reborrow_mut().into_len_mut()` are annoyingly long, and kind of dangerous for the reason `reborrow_mut()` is unsafe. By relaxing the single rule, we no longer have to make an exception for functions with a `borrow` name and functions like `as_leaf_mut`. This is largely restoring the declaration style of the btree::node API about a year ago, but with more explanation and consistency.
r? `@Mark-Simulacrum`
Stabilize or_insert_with_key
Stabilizes the `or_insert_with_key` feature from https://github.com/rust-lang/rust/issues/71024. This allows inserting key-derived values when a `HashMap`/`BTreeMap` entry is vacant.
The difference between this and `.or_insert_with(|| ... )` is that this provides a reference to the key to the closure after it is moved with `.entry(key_being_moved)`, avoiding the need to copy or clone the key.
Fix overflow when converting ZST Vec to VecDeque
```rust
let v = vec![(); 100];
let queue = VecDeque::from(v);
println!("{:?}", queue);
```
This code will currently panic with a capacity overflow.
This PR resolves this issue and makes the code run fine.
Resolves#78532
Do not inline finish_grow
Fixes#78471.
Looking at libgkrust.a in Firefox, the sizes for the `gkrust.*.o` file is:
- 18584816 (text) 582418 (data) with unmodified master
- 17937659 (text) 582554 (data) with #72227 reverted
- 17968228 (text) 582858 (data) with `#[inline(never)]` on `grow_amortized` and `grow_exact`, but that has some performance consequences
- 17927760 (text) 582322 (data) with this change
So in terms of size, at least in the case of Firefox, this patch more than undoes the regression. I don't think it should affect performance, but we'll see.
doc(array,vec): add notes about side effects when empty-initializing
Copying some context from a conversation in the Rust discord:
* Both `vec![T; 0]` and `[T; 0]` are syntactically valid, and produce empty containers of their respective types
* Both *also* have side effects:
```rust
fn side_effect() -> String {
println!("side effect!");
"foo".into()
}
fn main() {
println!("before!");
let x = vec![side_effect(); 0];
let y = [side_effect(); 0];
println!("{:?}, {:?}", x, y);
}
```
produces:
```
before!
side effect!
side effect!
[], []
```
This PR just adds two small notes to each's documentation, warning users that side effects can occur.
I've also submitted a clippy proposal: https://github.com/rust-lang/rust-clippy/issues/6439
BTreeMap: clarify comments and panics around choose_parent_kv
Fixes a lie in recent code: `unreachable!("empty non-root node")` should shout "empty internal node", but it might as well be good and keep quiet
r? `@Mark-Simulacrum`
Clarify that String::split_at takes a byte index.
To someone skimming through the `String` docs and only reads the first line, the person could interpret "index" to be "char index". Later on in the docs it clarifies, but by adding "byte" it removes that ambiguity.
Privatize some of libcore unicode_internals
My understanding is that these API are perma unstable, so it doesn't
make sense to pollute docs & IDE completion[1] with them.
[1]: https://github.com/rust-analyzer/rust-analyzer/issues/6738
We also change the specialization of `SpecFromIterNested::from_iter` for
`TrustedLen` to use `Vec::with_capacity` when the iterator has a proper size
hint, instead of `Vec::new`, avoiding calls to `grow_*` and thus
`finish_grow` in some fully inlinable cases, which would regress with
this change.
Fixes#78471.
BTreeMap: try to enhance various comments
All in internal documentation, propagating the "key-value pair" notation from public documentation.
r? ``@Mark-Simulacrum``
Require allocator to be static for boxed `Pin`-API
Allocators has to retain their validity until the instance and all of its clones are dropped. When pinning a value, it must live forever, thus, the allocator requires a `'static` lifetime for pinning a value. [Example from reddit](https://www.reddit.com/r/rust/comments/jymzdw/the_story_continues_vec_now_supports_custom/gd7qak2?utm_source=share&utm_medium=web2x&context=3):
```rust
let alloc = MyAlloc(/* ... */);
let pinned = Box::pin_in(42, alloc);
mem::forget(pinned); // Now `value` must live forever
// Otherwise `Pin`'s invariants are violated, storage invalidated
// before Drop was called.
// borrow of `memory` can end here, there is no value keeping it.
drop(alloc); // Oh, value doesn't live forever.
```
Rename `optin_builtin_traits` to `auto_traits`
They were originally called "opt-in, built-in traits" (OIBITs), but
people realized that the name was too confusing and a mouthful, and so
they were renamed to just "auto traits". The feature flag's name wasn't
updated, though, so that's what this PR does.
There are some other spots in the compiler that still refer to OIBITs,
but I don't think changing those now is worth it since they are internal
and not particularly relevant to this PR.
Also see <https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/opt-in.2C.20built-in.20traits.20(auto.20traits).20feature.20name>.
r? `@oli-obk` (feel free to re-assign if you're not the right reviewer for this)
They were originally called "opt-in, built-in traits" (OIBITs), but
people realized that the name was too confusing and a mouthful, and so
they were renamed to just "auto traits". The feature flag's name wasn't
updated, though, so that's what this PR does.
There are some other spots in the compiler that still refer to OIBITs,
but I don't think changing those now is worth it since they are internal
and not particularly relevant to this PR.
Also see <https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/opt-in.2C.20built-in.20traits.20(auto.20traits).20feature.20name>.
Rollup of 10 pull requests
Successful merges:
- #76829 (stabilize const_int_pow)
- #79080 (MIR visitor: Don't treat debuginfo field access as a use of the struct)
- #79236 (const_generics: assert resolve hack causes an error)
- #79287 (Allow using generic trait methods in `const fn`)
- #79324 (Use Option::and_then instead of open-coding it)
- #79325 (Reduce boilerplate with the `?` operator)
- #79330 (Fix typo in comment)
- #79333 (doc typo)
- #79337 (Use Option::map instead of open coding it)
- #79343 (Add my (`@flip1995)` work mail to the mailmap)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Change slice::to_vec to not use extend_from_slice
I saw this [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/String.3A.3Afrom%28.26str%29.20wonky.20codegen/near/216164455), and didn't see any update from it, so I thought I'd try to fix it. This converts `to_vec` to no longer use `extend_from_slice`, but relies on knowing that the allocated capacity is the same size as the input.
[Godbolt new v1](https://rust.godbolt.org/z/1bcWKG)
[Godbolt new v2 w/ drop guard](https://rust.godbolt.org/z/5jn76K)
[Godbolt old version](https://rust.godbolt.org/z/e4ePav)
After some amount of iteration, there are now two specializations for `to_vec`, one for `Copy` types that use memcpy, and one for clone types which is the original from this PR.
This is then used inside of `impl<T: Clone> FromIterator<Iter::Slice<T>> for Vec<T>` which is essentially equivalent to `&[T] -> Vec<T>`, instead of previous specialization of the `extend` function. This is because extend has to reason more about existing capacity by calling `reserve` on an existing vec, and thus produces worse asm.
Downsides: This allocates the exact capacity, so I think if many items are added to this `Vec` after, it might need to allocate whereas extending may not. I also noticed the number of faults went up in the benchmarks, but not sure where from exactly.
This also required adding a loop guard in case clone panics
Add specialization for copy
There is a better version for copy, so I've added specialization for that function
and hopefully that should speed it up even more.
Switch FromIter<slice::Iter> to use `to_vec`
Test different unrolling version for to_vec
Revert to impl
From benchmarking, it appears this version is faster
BTreeMap: swap the names of NodeRef::new and Root::new_leaf
#78104 preserved the name of Root::new_leaf to minimize changes, but the resulting names are confusing.
r? `@Mark-Simulacrum`
BTreeMap: address namespace conflicts
Fix an annoyance popping up whenever synchronizing the test cases with a version capable of miri-track-raw-pointers.
r? `@Mark-Simulacrum`
More consistently use spaces after commas in lists in docs
This PR changes instances of lists that didn't use spaces after commas, like `vec![1,2,3]`, to `vec![1, 2, 3]` to be more consistent with idiomatic Rust style (the way these were looks strange to me, especially because there are often lists that *do* use spaces after the commas later in the same code block 😬).
I noticed one of these in an example in the stdlib docs and went looking for more, but as far as I can see, I'm only changing those spots in user-facing documentation or rustc output, and the changes make no semantic difference.
clarify rules for ZST Boxes
LLVM's rules around `getelementptr inbounds` with offset 0 are a bit annoying, and as a consequence we have no choice but say that a `Box<()>` pointing to previously allocated memory that has since been freed is UB. Clarify the docs to reflect this.
This is based on conversations on the LLVM mailing list.
* Here's my initial mail: https://lists.llvm.org/pipermail/llvm-dev/2019-February/130452.html
* The first email of the March part of that thread: https://lists.llvm.org/pipermail/llvm-dev/2019-March/130831.html
* First email of the April part: https://lists.llvm.org/pipermail/llvm-dev/2019-April/131693.html
The conclusion for me at least was that `getelementptr inbounds` with offset 0 is *not* the identity function, but can sometimes return `poison` even when the input is a regular pointer -- specifically, it returns `poison` when this pointer points into something that LLVM "knows has been deallocated", i.e., a former LLVM-managed allocation. It is however the identity function on pointers obtained by casting integers.
Note that there [are formal proposals](https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf) for LLVM semantics where `getelementptr inbounds` with offset 0 isn't quite the identity function but never returns `poison` (it affects the provenance of the pointer but in a way that doesn't matter if this pointer is never used for memory accesses), and indeed this is likely necessary to consistently describe LLVM semantics. But with the informal LLVM LangRef that we have right now, and with LLVM devs insisting otherwise, it seems unwise to rely on this.
Rename/Deprecate LayoutErr in favor of LayoutError
Implements rust-lang/wg-allocators#73.
This patch renames LayoutErr to LayoutError, and uses a type alias to support users using the old name.
The new name will be instantly stable in release 1.49 (current nightly), the type alias will become deprecated in release 1.51 (so that when the current nightly is 1.51, 1.49 will be stable).
This is the only error type in `std` that ends in `Err` rather than `Error`, if this PR lands all stdlib error types will end in `Error` 🥰
BTreeMap: fix pointer provenance rules in underfullness
Continuing on #78480, and for readability, and possibly for performance: avoid aliasing when handling underfull nodes, and consolidate the code doing that. In particular:
- Avoid the rather explicit aliasing for internal nodes in `remove_kv_tracking`.
- Climb down to the root to handle underfull nodes using a reborrowed handle, rather than one copied with `ptr::read`, before resuming on the leaf level.
- Integrate the code tracking leaf edge position into the functions performing changes, rather than bolting it on.
r? `@Mark-Simulacrum`
Implement BTreeMap::retain and BTreeSet::retain
Adds new methods `BTreeMap::retain` and `BTreeSet::retain`. These are implemented on top of `drain_filter` (#70530).
The API of these methods is identical to `HashMap::retain` and `HashSet::retain`, which were implemented in #39560 and stabilized in #36648. The docs and tests are also copied from HashMap/HashSet.
The new methods are unstable, behind the `btree_retain` feature gate, with tracking issue #79025. See also rust-lang/rfcs#1338.
commit c547d5fabcd756515afa7263ee5304965bb4c497
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 11:22:23 2020 +0000
test: updating ui/hygiene/panic-location.rs expected
commit 2af03769c4ffdbbbad75197a1ad0df8c599186be
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 10:43:30 2020 +0000
fix: documentation unresolved link
commit c4b0df361ce27d7392d8016229f2e0265af32086
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:58:31 2020 +0000
style: compiling with Rust's style guidelines
commit bdd2de5f3c09b49a18e3293f2457fcab25557c96
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:56:31 2020 +0000
refactor: removing ignore-tidy-filelength
commit fcc4b3bc41f57244c65ebb8e4efe4cbc9460b5a9
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:51:35 2020 +0000
refactor: moving trait RingSlices to ring_slices.rs
commit 2f0cc539c06d8841baf7f675168f68ca7c21e68e
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:46:09 2020 +0000
refactor: moving struct PairSlices to pair_slices.rs
commit a55d3ef1dab4c3d85962b3a601ff8d1f7497faf2
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:31:45 2020 +0000
refactor: moving struct Iter to iter.rs
commit 76ab33a12442a03726f36f606b4e0fe70f8f246b
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:24:32 2020 +0000
refactor: moving struct IntoIter into into_iter.rs
commit abe0d9eea2933881858c3b1bc09df67cedc5ada5
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 02:19:07 2020 +0000
refactor: moving struct IterMut into iter_mut.rs
commit 70ebd6420335e1895e2afa2763a0148897963e24
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 01:49:15 2020 +0000
refactor: moved macros into macros.rs
commit b08dd2add994b04ae851aa065800bd8bd6326134
Author: C <DeveloperC@protonmail.com>
Date: Sat Oct 31 01:05:36 2020 +0000
refactor: moving vec_deque.rs to vec_deque/mod.rs
Improve BinaryHeap performance
By changing the condition in the loops from `child < end` to `child < end - 1` we're guaranteed that `right = child + 1 < end` and since finding the index of the biggest sibling can be done with an arithmetic operation we can remove a branch from the loop body. The case where there's no right child, i.e. `child == end - 1` is instead handled outside the loop, after it ends; note that if the loops ends early we can use `return` instead of `break` since the check `child == end - 1` will surely fail.
I've also removed a call to `<[T]>::swap` that was hiding a bound check that [wasn't being optimized by LLVM](https://godbolt.org/z/zrhdGM).
A quick benchmarks on my pc shows that the gains are pretty significant:
|name |before ns/iter |after ns/iter |diff ns/iter |diff % |speedup |
|---------------------|----------------|---------------|--------------|----------|--------|
|find_smallest_1000 | 352,565 | 260,098 | -92,467 | -26.23% | x 1.36 |
|from_vec | 676,795 | 473,934 | -202,861 | -29.97% | x 1.43 |
|into_sorted_vec | 469,511 | 304,275 | -165,236 | -35.19% | x 1.54 |
|pop | 483,198 | 373,778 | -109,420 | -22.64% | x 1.29 |
The other 2 benchmarks for `BinaryHeap` (`peek_mut_deref_mut` and `push`) weren't impacted and as such didn't show any significant change.