Stabilize the `core_c_str` and `alloc_c_string` feature gates.
Change `std::ffi` to re-export these types rather than creating type
aliases, since they now have matching stability.
Stabilize `core::ffi:c_*` and rexport in `std::ffi`
This only stabilizes the base types, not the non-zero variants, since
those have their own separate tracking issue and have not gone through
FCP to stabilize.
This only stabilizes the base types, not the non-zero variants, since
those have their own separate tracking issue and have not gone through
FCP to stabilize.
Enforce that layout size fits in isize in Layout
As it turns out, enforcing this _in APIs that already enforce `usize` overflow_ is fairly trivial. `Layout::from_size_align_unchecked` continues to "allow" sizes which (when rounded up) would overflow `isize`, but these are now declared as library UB for `Layout`, meaning that consumers of `Layout` no longer have to check this before making an allocation.
(Note that this is "immediate library UB;" IOW it is valid for a future release to make this immediate "language UB," and there is an extant patch to do so, to allow Miri to catch this misuse.)
See also #95252, [Zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Layout.20Isn't.20Enforcing.20The.20isize.3A.3AMAX.20Rule).
Fixes https://github.com/rust-lang/rust/issues/95334
Some relevant quotes:
`@eddyb,` https://github.com/rust-lang/rust/pull/95252#issuecomment-1078513769
> [B]ecause of the non-trivial presence of both of these among code published on e.g. crates.io:
>
> 1. **`Layout` "producers" / `GlobalAlloc` "users"**: smart pointers (including `alloc::rc` copies with small tweaks), collections, etc.
> 2. **`Layout` "consumers" / `GlobalAlloc` "providers"**: perhaps fewer of these, but anything built on top of OS APIs like `mmap` will expose `> isize::MAX` allocations (on 32-bit hosts) if they lack extra checks
>
> IMO the only responsible option is to enforce the `isize::MAX` limit in `Layout`, which:
>
> * makes `Layout` _sound_ in terms of only ever allowing allocations where `(alloc_base_ptr: *mut u8).offset(size)` is never UB
> * frees both "producers" and "consumers" of `Layout` from manually reimplementing the checks
> * manual checks can be risky, e.g. if the final size passed to the allocator isn't the one being checked
> * this applies retroactively, fixing the overall soundness of existing code with zero transition period or _any_ changes required from users (as long as going through `Layout` is mandatory, making a "choke point")
>
>
> Feel free to quote this comment onto any relevant issue, I might not be able to keep track of developments.
`@Gankra,` https://github.com/rust-lang/rust/pull/95252#issuecomment-1078556371
> As someone who spent way too much time optimizing libcollections checks for this stuff and tried to splatter docs about it everywhere on the belief that it was a reasonable thing for people to manually take care of: I concede the point, it is not reasonable. I am wholy spiritually defeated by the fact that _liballoc_ of all places is getting this stuff wrong. This isn't throwing shade at the folks who implemented these Rc features, but rather a statement of how impractical it is to expect anyone out in the wider ecosystem to enforce them if _some of the most audited rust code in the library that defines the very notion of allocating memory_ can't even reliably do it.
>
> We need the nuclear option of Layout enforcing this rule. Code that breaks this rule is _deeply_ broken and any "regressions" from changing Layout's contract is a _correctness_ fix. Anyone who disagrees and is sufficiently motivated can go around our backs but the standard library should 100% refuse to enable them.
cc also `@RalfJung` `@rust-lang/wg-allocators.` Even though this technically supersedes #95252, those potential failure points should almost certainly still get nicer panics than just "unwrap failed" (which they would get by this PR).
It might additionally be worth recommending to users of the `Layout` API that they should ideally use `.and_then`/`?` to complete the entire layout calculation, and then `panic!` from a single location at the end of `Layout` manipulation, to reduce the overhead of the checks and optimizations preserving the exact location of each `panic` which are conceptually just one failure: allocation too big.
Probably deserves a T-lang and/or T-libs-api FCP (this technically solidifies the [objects must be no larger than `isize::MAX`](https://rust-lang.github.io/unsafe-code-guidelines/layout/scalars.html#isize-and-usize) rule further, and the UCG document says this hasn't been RFCd) and a crater run. Ideally, no code exists that will start failing with this addition; if it does, it was _likely_ (but not certainly) causing UB.
Changes the raw_vec allocation path, thus deserves a perf run as well.
I suggest hiding whitespace-only changes in the diff view.
Make `ThinBox<T>` covariant in `T`
Just like `Box<T>`, we want `ThinBox<T>` to be covariant in `T`, but the
projection in `WithHeader<<T as Pointee>::Metadata>` was making it
invariant. This is now hidden as `WithOpaqueHeader`, which we type-cast
whenever the real `WithHeader<H>` type is needed.
Fixes the problem noted in <https://github.com/rust-lang/rust/issues/92791#issuecomment-1104636249>.
Just like `Box<T>`, we want `ThinBox<T>` to be covariant in `T`, but the
projection in `WithHeader<<T as Pointee>::Metadata>` was making it
invariant. This is now hidden as `WithOpaqueHeader`, which we type-cast
whenever the real `WithHeader<H>` type is needed.
Documentation for the following methods
with_capacity
with_capacity_in
with_capacity_and_hasher
reserve
reserve_exact
try_reserve
try_reserve_exact
was inconsistent and often not entirely correct where they existed on the following types
Vec
VecDeque
String
OsString
PathBuf
BinaryHeap
HashSet
HashMap
BufWriter
LineWriter
since the allocator is allowed to allocate more than the requested capacity in all such cases, and will frequently "allocate" much more in the case of zero-sized types (I also checked BufReader, but there the docs appear to be accurate as it appears to actually allocate the exact capacity).
Some effort was made to make the documentation more consistent between types as well.
Fix with_capacity* methods for Vec
Fix *reserve* methods for Vec
Fix docs for *reserve* methods of VecDeque
Fix docs for String::with_capacity
Fix docs for *reserve* methods of String
Fix docs for OsString::with_capacity
Fix docs for *reserve* methods on OsString
Fix docs for with_capacity* methods on HashSet
Fix docs for *reserve methods of HashSet
Fix docs for with_capacity* methods of HashMap
Fix docs for *reserve methods on HashMap
Fix expect messages about OOM in doctests
Fix docs for BinaryHeap::with_capacity
Fix docs for *reserve* methods of BinaryHeap
Fix typos
Fix docs for with_capacity on BufWriter and LineWriter
Fix consistent use of `hasher` between `HashMap` and `HashSet`
Fix warning in doc test
Add test for capacity of vec with ZST
Fix doc test error
Avoid zero-sized allocs in ThinBox if T and H are both ZSTs.
This was surprisingly tricky, and took longer to get right than expected. `ThinBox` is a surprisingly subtle piece of code. That said, in the end, a lot of this was due to overthinking[^overthink] -- ultimately the fix ended up fairly clean and simple.
[^overthink]: Honestly, for a while I was convinced this couldn't be done without allocations or runtime branches in these cases, but that's obviously untrue.
Anyway, as a result of spending all that time debugging, I've extended the tests quite a bit, and also added more debug assertions. Many of these helped for subtle bugs I made in the middle (for example, the alloc/drop tracking is because I ended up double-dropping the value in the case where both were ZSTs), they're arguably a bit of overkill at this point, although I imagine they could help in the future too.
Anyway, these tests cover a wide range of size/align cases, nd fully pass under miri[^1]. They also do some smoke-check asserting that the value has the correct alignment, although in practice it's totally within the compiler's rights to delete these assertions since we'd have already done UB if they get hit. They have more boilerplate than they really need, but it's not *too* bad on a per-test basis.
A notable absence from testing is atypical header types, but at the moment it's impossible to manually implement `Pointee`. It would be really nice to have testing here, since it's not 100% obvious to me that the aligned read/write we use for `H` are correct in the face of arbitrary combinations of `size_of::<H>()`, `align_of::<H>()`, and `align_of::<T>()`. (That said, I spent a while thinking through it and am *pretty* sure it's fine -- I'd just feel... better if we could test some cases for non-ZST headers which have unequal and align).
[^1]: Or at least, they pass under miri if I copy the code and tests into a new crate and run miri on it (after making it less stdlibified).
Fixes#96485.
I'd request review ``@yaahc,`` but I believe you're taking some time away from reviews, so I'll request from the previous PR's reviewer (I think that the context helps, even if the actual change didn't end up being bad here).
r? ``@joshtriplett``
Classify BinaryHeap & LinkedList unit tests as such
All but one of these so-called integration test case are unit tests, just like btree's were (#75531). In addition, reunite the unit tests of linked_list that were split off during #23104 because they needed to remain unit tests (they were later moved to the separate file they are in during #63207). The two sets could remain separate files, but I opted to merge them back together, more or less in the order they used to be, apart from one duplicate name `test_split_off` and one duplicate tiny function `list_from`.
Relevant commit messages from squashed history in order:
Add initial version of ThinBox
update test to actually capture failure
swap to middle ptr impl based on matthieu-m's design
Fix stack overflow in debug impl
The previous version would take a `&ThinBox<T>` and deref it once, which
resulted in a no-op and the same type, which it would then print causing
an endless recursion. I've switched to calling `deref` by name to let
method resolution handle deref the correct number of times.
I've also updated the Drop impl for good measure since it seemed like it
could be falling prey to the same bug, and I'll be adding some tests to
verify that the drop is happening correctly.
add test to verify drop is behaving
add doc examples and remove unnecessary Pointee bounds
ThinBox: use NonNull
ThinBox: tests for size
Apply suggestions from code review
Co-authored-by: Alphyr <47725341+a1phyr@users.noreply.github.com>
use handle_alloc_error and fix drop signature
update niche and size tests
add cfg for allocating APIs
check null before calculating offset
add test for zst and trial usage
prevent optimizer induced ub in drop and cleanup metadata gathering
account for arbitrary size and alignment metadata
Thank you nika and thomcc!
Update library/alloc/src/boxed/thin.rs
Co-authored-by: Josh Triplett <josh@joshtriplett.org>
Update library/alloc/src/boxed/thin.rs
Co-authored-by: Josh Triplett <josh@joshtriplett.org>
Fix double drop of allocator in IntoIter impl of Vec
Fixes#95269
The `drop` impl of `IntoIter` reconstructs a `RawVec` from `buf`, `cap` and `alloc`, when that `RawVec` is dropped it also drops the allocator. To avoid dropping the allocator twice we wrap it in `ManuallyDrop` in the `InttoIter` struct.
Note this is my first contribution to the standard library, so I might be missing some details or a better way to solve this.
Use modern formatting for format! macros
This updates the standard library's documentation to use the new format_args syntax.
The documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
`eprintln!("{}", e)` becomes `eprintln!("{e}")`, but `eprintln!("{}", e.kind())` remains untouched.
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
Implement `panic::update_hook`
Add a new function `panic::update_hook` to allow creating panic hooks that forward the call to the previously set panic hook, without race conditions. It works by taking a closure that transforms the old panic hook into a new one, while ensuring that during the execution of the closure no other thread can modify the panic hook. This is a small function so I hope it can be discussed here without a formal RFC, however if you prefer I can write one.
Consider the following example:
```rust
let prev = panic::take_hook();
panic::set_hook(Box::new(move |info| {
println!("panic handler A");
prev(info);
}));
```
This is a common pattern in libraries that need to do something in case of panic: log panic to a file, record code coverage, send panic message to a monitoring service, print custom message with link to github to open a new issue, etc. However it is impossible to avoid race conditions with the current API, because two threads can execute in this order:
* Thread A calls `panic::take_hook()`
* Thread B calls `panic::take_hook()`
* Thread A calls `panic::set_hook()`
* Thread B calls `panic::set_hook()`
And the result is that the original panic hook has been lost, as well as the panic hook set by thread A. The resulting panic hook will be the one set by thread B, which forwards to the default panic hook. This is not considered a big issue because the panic handler setup is usually run during initialization code, probably before spawning any other threads.
Using the new `panic::update_hook` function, this race condition is impossible, and the result will be either `A, B, original` or `B, A, original`.
```rust
panic::update_hook(|prev| {
Box::new(move |info| {
println!("panic handler A");
prev(info);
})
});
```
I found one real world use case here: 988cf403e7/src/detection.rs (L32) the workaround is to detect the race condition and panic in that case.
The pattern of `take_hook` + `set_hook` is very common, you can see some examples in this pull request, so I think it's natural to have a function that combines them both. Also using `update_hook` instead of `take_hook` + `set_hook` reduces the number of calls to `HOOK_LOCK.write()` from 2 to 1, but I don't expect this to make any difference in performance.
### Unresolved questions:
* `panic::update_hook` takes a closure, if that closure panics the error message is "panicked while processing panic" which is not nice. This is a consequence of holding the `HOOK_LOCK` while executing the closure. Could be avoided using `catch_unwind`?
* Reimplement `panic::set_hook` as `panic::update_hook(|_prev| hook)`?
Allow reverse iteration of lowercase'd/uppercase'd chars
The PR implements `DoubleEndedIterator` trait for `ToLowercase` and `ToUppercase`.
This enables reverse iteration of lowercase/uppercase variants of character sequences.
One of use cases: determining whether a char sequence is a suffix of another one.
Example:
```rust
fn endswith_ignore_case(s1: &str, s2: &str) -> bool {
for eob in s1
.chars()
.flat_map(|c| c.to_lowercase())
.rev()
.zip_longest(s2.chars().flat_map(|c| c.to_lowercase()).rev())
{
match eob {
EitherOrBoth::Both(c1, c2) => {
if c1 != c2 {
return false;
}
}
EitherOrBoth::Left(_) => return true,
EitherOrBoth::Right(_) => return false,
}
}
true
}
```
Make split_inclusive() on an empty slice yield an empty output
`[].split_inclusive()` currently yields a single, empty slice. That's
different from `"".split_inslusive()`, which yields no output at
all. I think that makes the slice version harder to use.
The case where I ran into this bug was when writing code for
generating a diff between two slices of bytes. I wanted to prefix
removed lines with "-" and a added lines with "+". Due to
`split_inclusive()`'s current behavior, that means that my code prints
just a "-" or "+" for empty files. I suspect most existing callers
have similar "bugs" (which would be fixed by this patch).
Closes#89716.
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
This commit makes the following functions from `core::str` `const fn`:
- `from_utf8[_mut]` (`feature(const_str_from_utf8)`)
- `from_utf8_unchecked_mut` (`feature(const_str_from_utf8_unchecked_mut)`)
- `Utf8Error::{valid_up_to,error_len}` (`feature(const_str_from_utf8)`)
`[].split_inclusive()` currently yields a single, empty slice. That's
different from `"".split_inslusive()`, which yields no output at
all. I think that makes the slice version harder to use.
The case where I ran into this bug was when writing code for
generating a diff between two slices of bytes. I wanted to prefix
removed lines with "-" and a added lines with "+". Due to
`split_inclusive()`'s current behavior, that means that my code prints
just a "-" or "+" for empty files. I suspect most existing callers
have similar "bugs" (which would be fixed by this patch).
Closes#89716.
BTree: remove Ord bound from new
`K: Ord` bound is unnecessary on `BTree{Map,Set}::new` and their `Default` impl. No elements exist so there are nothing to compare anyway, so I don't think "future proof" would be a blocker here. This is analogous to `HashMap::new` not having a `K: Eq + Hash` bound.
#79245 originally does this and for some reason drops the change to `new` and `Default`. I can see why changes to other methods like `entry` or `symmetric_difference` need to be careful but I couldn't find out any reason not to do it on `new`.
Removing the bound also makes the stabilisation of `const fn new` not depending on const trait bounds.
cc `@steffahn` who suggests me to make this PR.
r? `@dtolnay`
The libs-api team agrees to allow const_trait_impl to appear in the
standard library as long as stable code cannot be broken (they are
properly gated) this means if the compiler teams thinks it's okay, then
it's okay.
My priority on constifying would be:
1. Non-generic impls (e.g. Default) or generic impls with no
bounds
2. Generic functions with bounds (that use const impls)
3. Generic impls with bounds
4. Impls for traits with associated types
For people opening constification PRs: please cc me and/or oli-obk.
Assign FIXMEs to me and remove obsolete ones
Also fixed capitalization of documentation
We also don't need to transform predicates to be non-const since we basically ignore const predicates in non-const contexts.
r? `````@oli-obk`````
Test and fix `size_hint` for slice’s [r]split* iterators
Adds extensive test (of `size_hint`) for all the _[r]split*_ iterators.
Fixes `size_hint` upper bound for _split_inclusive*_ iterators which was one higher than necessary for non-empty slices.
Fixes `size_hint` lower bound for _[r]splitn*_ iterators when _n == 0_, which was one too high.
**Lower bound being one too high was a logic error, violating the correctness condition of `size_hint`.**
_Edit:_ I’ve opened an issue for that bug, so this PR fixes#87978
Adds extensive test for all the [r]split* iterators.
Fixes size_hint upper bound for split_inclusive* iterators which was one higher than necessary for non-empty slices.
Fixes size_hint lower bound for [r]splitn* iterators when n==0, which was one too high.
Hide allocator details from TryReserveError
I think there's [no need for TryReserveError to carry detailed information](https://github.com/rust-lang/rust/issues/48043#issuecomment-825139280), but I wouldn't want that issue to delay stabilization of the `try_reserve` feature.
So I'm proposing to stabilize `try_reserve` with a `TryReserveError` as an opaque structure, and if needed, expose error details later.
This PR moves the `enum` to an unstable inner `TryReserveErrorKind` that lives under a separate feature flag. `TryReserveErrorKind` could possibly be left as an implementation detail forever, and the `TryReserveError` get methods such as `allocation_size() -> Option<usize>` or `layout() -> Option<Layout>` instead, or the details could be dropped completely to make try-reserve errors just a unit struct, and thus smaller and cheaper.
Update standard library for IntoIterator implementation of arrays
This PR partially resolves issue #84513 of updating the standard library part.
I haven't found any remaining doctest examples which are using iterators over e.g. &i32 instead of just i32 in the standard library. Can anyone point me to them if there's remaining any?
Thanks!
r? ```@m-ou-se```
This also checks the contents and not only the capacity in case IntoIter's clone implementation is changed to add capacity at the end. Extra capacity at the beginning would be needed to make InPlaceIterable work.
Co-authored-by: Giacomo Stevanato <giaco.stevanato@gmail.com>
The unsoundness is not in Peekable per se, it rather is due to the
interaction between Peekable being able to hold an extra item
and vec::IntoIter's clone implementation shortening the allocation.
An alternative solution would be to change IntoIter's clone implementation
to keep enough spare capacity available.
Fix double-drop in `Vec::from_iter(vec.into_iter())` specialization when items drop during panic
This fixes the double-drop but it leaves a behavioral difference compared to the default implementation intact: In the default implementation the source and the destination vec are separate objects, so they get dropped separately. Here they share an allocation and the latter only exists as a pointer into the former. So if dropping the former panics then this fix will leak more items than the default implementation would. Is this acceptable or should the specialization also mimic the default implementation's drops-during-panic behavior?
Fixes#83618
`@rustbot` label T-libs-impl
alloc: Added `as_slice` method to `BinaryHeap` collection
I initially asked about whether it is useful addition on https://internals.rust-lang.org/t/should-i-add-as-slice-method-to-binaryheap/13816, and it seems there were no objections, so went ahead with this PR.
> There is [`BinaryHeap::into_vec`](https://doc.rust-lang.org/std/collections/struct.BinaryHeap.html#method.into_vec), but it consumes the value. I wonder if there is API design limitation that should be taken into account. Implementation-wise, the inner buffer is just a Vec, so it is trivial to expose as_slice from it.
Please, guide me through if I need to add tests or something else.
UPD: Tracking issue #83659
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in https://github.com/rust-lang/rfcs/issues/1074, and closes#24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.
While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).
There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
Fixes#83046
The program
fn main() {
println!("{:?}", '"');
println!("{:?}", "'");
}
would previously print
'\"'
"\'"
With this patch it now prints:
'"'
"'"