Avoid zero-length memcpy in formatting
This has two separate and somewhat orthogonal commits. The first change adjusts the ToString general impl for all types that implement Display; it no longer uses the full format machinery, rather directly falling onto a `std::fmt::Display::fmt` call. The second change directly adjusts the general core::fmt::write function which handles the production of format_args! to avoid zero-length push_str calls.
Both changes target the fact that push_str will still call memmove internally (or a similar function), as it doesn't know the length of the passed string. For zero-length strings in particular, this is quite expensive, and even for very short (several bytes long) strings, this is also expensive. Future work in this area may wish to have us fallback to write_char or similar, which may be cheaper on the (typically) short strings between the interpolated pieces in format_args!.
Implement the new desugaring from `try_trait_v2`
~~Currently blocked on https://github.com/rust-lang/rust/issues/84782, which has a PR in https://github.com/rust-lang/rust/pull/84811~~ Rebased atop that fix.
`try_trait_v2` tracking issue: https://github.com/rust-lang/rust/issues/84277
Unfortunately this is already touching a ton of things, so if you have suggestions for good ways to split it up, I'd be happy to hear them. (The combination between the use in the library, the compiler changes, the corresponding diagnostic differences, even MIR tests mean that I don't really have a great plan for it other than trying to have decently-readable commits.
r? `@ghost`
~~(This probably shouldn't go in during the last week before the fork anyway.)~~ Fork happened.
This avoids a zero-length write_str call, which boils down to a zero-length
memmove and ultimately costs quite a few instructions on some workloads.
This is approximately a 0.33% instruction count win on diesel-check.
BTree: no longer copy keys and values before dropping them
When dropping BTreeMap or BTreeSet instances, keys-value pairs are up to now each copied and then dropped, at least according to source code. This is because the code for dropping and for iterators is shared.
This PR postpones the treatment of doomed key-value pairs from the intermediate functions `deallocating_next`(`_back`) to the last minute, so the we can drop the keys and values in place. According to the library/alloc benchmarks, this does make a difference, (and a positive difference with an `#[inline]` on `drop_key_val`). It does not change anything for #81444 though.
r? `@Mark-Simulacrum`
Stablize {HashMap,BTreeMap}::into_{keys,values}
I would propose to stabilize `{HashMap,BTreeMap}::into_{keys,values}`( aka. `map_into_keys_values`).
Closes#75294.
For certain sorts of systems, programming, it's deemed essential that
all allocation failures be explicitly handled where they occur. For
example, see Linus Torvald's opinion in [1]. Merely not calling global
panic handlers, or always `try_reserving` first (for vectors), is not
deemed good enough, because the mere presence of the global OOM handlers
is burdens static analysis.
One option for these projects to use rust would just be to skip `alloc`,
rolling their own allocation abstractions. But this would, in my
opinion be a real shame. `alloc` has a few `try_*` methods already, and
we could easily have more. Features like custom allocator support also
demonstrate and existing to support diverse use-cases with the same
abstractions.
A natural way to add such a feature flag would a Cargo feature, but
there are currently uncertainties around how std library crate's Cargo
features may or not be stable, so to avoid any risk of stabilizing by
mistake we are going with a more low-level "raw cfg" token, which
cannot be interacted with via Cargo alone.
Note also that since there is no notion of "default cfg tokens" outside
of Cargo features, we have to invert the condition from
`global_oom_handling` to to `not(no_global_oom_handling)`. This breaks
the monotonicity that would be important for a Cargo feature (i.e.
turning on more features should never break compatibility), but it
doesn't matter for raw cfg tokens which are not intended to be
"constraint solved" by Cargo or anything else.
To support this use-case we create a new feature, "global-oom-handling",
on by default, and put the global OOM handler infra and everything else
it that depends on it behind it. By default, nothing is changed, but
users concerned about global handling can make sure it is disabled, and
be confident that all OOM handling is local and explicit.
For this first iteration, non-flat collections are outright disabled.
`Vec` and `String` don't yet have `try_*` allocation methods, but are
kept anyways since they can be oom-safely created "from parts", and we
hope to add those `try_` methods in the future.
[1]: https://lore.kernel.org/lkml/CAHk-=wh_sNLoz84AUUzuqXEsYH35u=8HV3vK-jbRbJ_B-JjGrg@mail.gmail.com/
Replace 'NULL' with 'null'
This replaces occurrences of "NULL" with "null" in docs, comments, and compiler error/lint messages. This is for the sake of consistency, as the lowercase "null" is already the dominant form in Rust. The all-caps NULL looks like the C macro (or SQL keyword), which seems out of place in a Rust context, given that NULL does not exist in the Rust language or standard library (instead having [`ptr::null()`](https://doc.rust-lang.org/stable/std/ptr/fn.null.html)).
i8 and u8::to_string() specialisation (far less asm).
Take 2. Around 1/6th of the assembly to without specialisation.
https://godbolt.org/z/bzz8Mq
(partially fixes#73533 )
Remove slice diagnostic item
...because it is unusally placed on an impl and is redundant with a lang item.
Depends on rust-lang/rust-clippy#7074 (next clippy sync). ~I expect clippy tests to fail in the meantime.~ Nope tests passed...
CC `@flip1995`
further split up const_fn feature flag
This continues the work on splitting up `const_fn` into separate feature flags:
* `const_fn_trait_bound` for `const fn` with trait bounds
* `const_fn_unsize` for unsizing coercions in `const fn` (looks like only `dyn` unsizing is still guarded here)
I don't know if there are even any things left that `const_fn` guards... at least libcore and liballoc do not need it any more.
`@oli-obk` are you currently able to do reviews?
Remove duplicated fn(Box<[T]>) -> Vec<T>
`<[T]>::into_vec()` does the same thing as `Vec::from::<Box<[T]>>()`, so they can be implemented in terms of each other. This was the previous implementation of `Vec::from()`, but was changed in #78461. I'm not sure what the rationale was for that change, but it seems preferable to maintain a single implementation.
Replace all `fmt.pad` with `debug_struct`
This replaces any occurrence of:
- `f.pad("X")` with `f.debug_struct("X").finish()`
- `f.pad("X { .. }")` with `f.debug_struct("X").finish_non_exhaustive()`
This is in line with existing formatting code such as
1255053067/library/std/src/sync/mpsc/mod.rs (L1470-L1475)
Improve code example for length comparison
Small fix/improvement: it's much safer to check that you're under the length of an array rather than chacking that you're equal to it. It's even more true in case you update the length of the array while iterating.
Add strong_count mutation methods to Rc
The corresponding methods were stabilized on `Arc` in #79285 (tracking: #71983). This patch implements and stabilizes identical methods on the `Rc` types as well.
Bump bootstrap to 1.52 beta
This includes the standard bump, but also a workaround for new cargo behavior around clearing out the doc directory when the rustdoc version changes.
BTree: move blocks around in node.rs
Without changing any names or implementation, reorder some members:
- Move down the ones defined long ago on the demised `struct Root`, to below the definition of their current host `struct NodeRef`.
- Move up some defined on `struct NodeRef` that are interspersed with those defined on `struct Handle`.
- Move up the `correct_…` methods squeezed between the two flavours of `push`.
- Move the unchecked static downcasts (`cast_to_…`) after the upcasts (`forget_`) and the (weirdly named) dynamic downcasts (`force`).
r? ````@Mark-Simulacrum````
BTree: no longer search arrays twice to check Ord
A possible addition to / partial replacement of #83147: no longer linearly search the upper bound of a range in the initial portion of the keys we already know are below the lower bound.
- Should be faster: fewer key comparisons at the cost of some instructions dealing with offsets
- Makes code a little more complicated.
- No longer detects ill-defined `Ord` implementations, but that wasn't a publicised feature, and was quite incomplete, and was only done in the `range` and `range_mut` methods.
r? `@Mark-Simulacrum`
Fix double-drop in `Vec::from_iter(vec.into_iter())` specialization when items drop during panic
This fixes the double-drop but it leaves a behavioral difference compared to the default implementation intact: In the default implementation the source and the destination vec are separate objects, so they get dropped separately. Here they share an allocation and the latter only exists as a pointer into the former. So if dropping the former panics then this fix will leak more items than the default implementation would. Is this acceptable or should the specialization also mimic the default implementation's drops-during-panic behavior?
Fixes#83618
`@rustbot` label T-libs-impl
alloc: Added `as_slice` method to `BinaryHeap` collection
I initially asked about whether it is useful addition on https://internals.rust-lang.org/t/should-i-add-as-slice-method-to-binaryheap/13816, and it seems there were no objections, so went ahead with this PR.
> There is [`BinaryHeap::into_vec`](https://doc.rust-lang.org/std/collections/struct.BinaryHeap.html#method.into_vec), but it consumes the value. I wonder if there is API design limitation that should be taken into account. Implementation-wise, the inner buffer is just a Vec, so it is trivial to expose as_slice from it.
Please, guide me through if I need to add tests or something else.
UPD: Tracking issue #83659
may not -> might not
may not -> might not
"may not" has two possible meanings:
1. A command: "You may not stay up past your bedtime."
2. A fact that's only sometimes true: "Some cities may not have bike lanes."
In some cases, the meaning is ambiguous: "Some cars may not have snow
tires." (do the cars *happen* to not have snow tires, or is it
physically impossible for them to have snow tires?)
This changes places where the standard library uses the "description of
fact" meaning to say "might not" instead.
This is just `std::vec` for now - if you think this is a good idea I can
convert the rest of the standard library.
Adjust documentation links for slice::make_ascii_*case
The documentation for the functions `slice::to_ascii_lowercase` and `slice::to_ascii_uppercase` contain the suggestion
> To lowercase the value in-place, use `make_ascii_lowercase`
however the link to the suggested method takes you to the page for `u8`, rather than the method of that name on the same page.
"may not" has two possible meanings:
1. A command: "You may not stay up past your bedtime."
2. A fact that's only sometimes true: "Some cities may not have bike lanes."
In some cases, the meaning is ambiguous: "Some cars may not have snow
tires." (do the cars *happen* to not have snow tires, or is it
physically impossible for them to have snow tires?)
This changes places where the standard library uses the "description of
fact" meaning to say "might not" instead.
This is just `std::vec` for now - if you think this is a good idea I can
convert the rest of the standard library.
Add function core::iter::zip
This makes it a little easier to `zip` iterators:
```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```
You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:
```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```
It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:
```rust
iter::zip(
trait_ref.substs.types().skip(1),
impl_trait_ref.substs.types().skip(1),
)
// vs.
trait_ref
.substs
.types()
.skip(1)
.zip(impl_trait_ref.substs.types().skip(1))
```
This replaces the tuple-pair `IntoIterator` in #78204.
There is prior art for the utility of this in [`itertools::zip`].
[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in https://github.com/rust-lang/rfcs/issues/1074, and closes#24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.
While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).
There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
Make # pretty print format easier to discover
# Rationale:
I use (cargo cult?) three formats in rust: `{}`, debug `{:?}`, and pretty-print debug `{:#?}`. I discovered `{:#?}` in some blog post or guide when I started working in Rust. While `#` is documented I think it is hard to discover. So taking the good advice of ```@carols10cents``` I am trying to improve the docs with a PR
As a reminder "pretty print" means that where `{:?}` will print something like
```
foo: { b1: 1, b2: 2}
```
`{:#?}` will prints something like
```
foo {
b1: 1
b2: 3
}
```
# Changes
Add an example to `fmt` to try and make it easier to discover `#`
This seems to have been omitted from the beginning when this feature
was first introduced in 86bf96291d.
Most users won't need to name this type which is probably why this
wasn't noticed in the meantime.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
"semantic equivalence" is too strong a phrasing here, which is why
actually explaining what kind of circumstances might produce a -0
was chosen instead.
Fix invalid slice access in String::retain
As noted in #78499, the previous fix was technically still unsound because it accessed elements of a slice outside its bounds (even though they were still inside the same allocation). This PR addresses that concern by switching to a dropguard approach.
This allows the optimizer to turn certain iterator pipelines such as
```rust
let vec = vec![0usize; 100];
vec.into_iter().map(|e| e as isize).collect::<Vec<_>>()
```
into a noop.
The optimization only applies when iterator sources are `T: Copy`
since `impl TrustedRandomAccess for IntoIter<T>`.
No such requirement applies to the output type (`Iterator::Item`).
Fix overflowing length in Vec<ZST> to VecDeque
`Vec` can hold up to `usize::MAX` ZST items, but `VecDeque` has a lower
limit to keep its raw capacity as a power of two, so we should check
that in `From<Vec<T>> for VecDeque<T>`. We can also simplify the
capacity check for the remaining non-ZST case.
Before this fix, the new test would fail on the length:
```
thread 'collections::vec_deque::tests::test_from_vec_zst_overflow' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `9223372036854775808`', library/alloc/src/collections/vec_deque/tests.rs:474:5
note: panic did not contain expected string
panic message: `"assertion failed: `(left == right)`\n left: `0`,\n right: `9223372036854775808`"`,
expected substring: `"capacity overflow"`
```
That was a result of `len()` using a mask `& (size - 1)` with the
improper length. Now we do get a "capacity overflow" panic as soon as
that `VecDeque::from(vec)` is attempted.
Fixes#80167.
Implement String::remove_matches
Closes#50206.
I lifted the function help from `@frewsxcv's` original PR (#50015), hope they don't mind.
I'm also wondering whether it would be useful for `remove_matches` to collect up the removed substrings into a `Vec` and return them, right now they're just overwritten by the copy and lost.
Add more links between hash and btree collections
- Link from `core::hash` to `HashMap` and `HashSet`
- Link from HashMap and HashSet to the module-level documentation on
when to use the collection
- Link from several collections to Wikipedia articles on the general
concept
See also https://github.com/rust-lang/rust/pull/81989#issuecomment-783920840.
Vec::dedup_by optimization
Now `Vec::dedup_by` drops items in-place as it goes through them.
From my benchmarks, it is around 10% faster when T is small, with no major regression when otherwise.
I used `ptr::copy` instead of conditional `ptr::copy_nonoverlapping`, because the latter had some weird performance issues on my ryzen laptop (it was 50% slower on it than on intel/sandybridge laptop)
It would be good if someone was able to reproduce these results.
`Vec` can hold up to `usize::MAX` ZST items, but `VecDeque` has a lower
limit to keep its raw capacity as a power of two, so we should check
that in `From<Vec<T>> for VecDeque<T>`. We can also simplify the
capacity check for the remaining non-ZST case.
Before this fix, the new test would fail on the length:
```
thread 'collections::vec_deque::tests::test_from_vec_zst_overflow' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `9223372036854775808`', library/alloc/src/collections/vec_deque/tests.rs:474:5
note: panic did not contain expected string
panic message: `"assertion failed: `(left == right)`\n left: `0`,\n right: `9223372036854775808`"`,
expected substring: `"capacity overflow"`
```
That was a result of `len()` using a mask `& (size - 1)` with the
improper length. Now we do get a "capacity overflow" panic as soon as
that `VecDeque::from(vec)` is attempted.