liballoc: introduce String, Vec const-slicing
This change `const`-qualifies many methods on `Vec` and `String`, notably `as_slice`, `as_str`, `len`. These changes are made behind the unstable feature flag `const_vec_string_slice`.
## Motivation
This is to support simultaneous variance over ownership and constness. I have an enum type that may contain either `String` or `&str`, and I want to produce a `&str` from it in a possibly-`const` context.
```rust
enum StrOrString<'s> {
Str(&'s str),
String(String),
}
impl<'s> StrOrString<'s> {
const fn as_str(&self) -> &str {
match self {
// In a const-context, I really only expect to see this variant, but I can't switch the implementation
// in some mode like #[cfg(const)] -- there has to be a single body
Self::Str(s) => s,
// so this is a problem, since it's not `const`
Self::String(s) => s.as_str(),
}
}
}
```
Currently `String` and `Vec` don't support this, but can without functional changes. Similar logic applies for `len`, `capacity`, `is_empty`.
## Changes
The essential thing enabling this change is that `Unique::as_ptr` is `const`. This lets us convert `RawVec::ptr` -> `Vec::as_ptr` -> `Vec::as_slice` -> `String::as_str`.
I had to move the `Deref` implementations into `as_{str,slice}` because `Deref` isn't `#[const_trait]`, but I would expect this change to be invisible up to inlining. I moved the `DerefMut` implementations as well for uniformity.
This change `const`-qualifies many methods on Vec and String, notably
`as_slice`, `as_str`, `len`. These changes are made behind the unstable
feature flag `const_vec_string_slice` with the following tracking issue:
https://github.com/rust-lang/rust/issues/129041
Mark some more types as having insignificant dtor
These were caught by https://github.com/rust-lang/rust/pull/129864#issuecomment-2376658407, which is implementing a lint for some changes in drop order for temporaries in tail expressions.
Specifically, the destructors of `CString` and the bitpacked repr for `std::io::Error` are insignificant insofar as they don't have side-effects on things like locking or synchronization; they just free memory.
See some discussion on #89144 for what makes a drop impl "significant"
Improve autovectorization of to_lowercase / to_uppercase functions
Refactor the code in the `convert_while_ascii` helper function to make it more suitable for auto-vectorization and also process the full ascii prefix of the string. The generic case conversion logic will only be invoked starting from the first non-ascii character.
The runtime on a microbenchmark with a small ascii-only input decreases from ~55ns to ~18ns per iteration. The new implementation also reduces the amount of unsafe code and encapsulates all unsafe inside the helper function.
Fixes#123712
update `compiler-builtins` to 0.1.126
this requires the addition of a bootstrap variant of the new `naked_asm!` macro
r? `@tgross35`
extracted from https://github.com/rust-lang/rust/pull/128651
Since the stabilization in #127679 has reached stage0, 1.82-beta, we can
start using `&raw` freely, and even the soft-deprecated `ptr::addr_of!`
and `ptr::addr_of_mut!` can stop allowing the unstable feature.
I intentionally did not change any documentation or tests, but the rest
of those macro uses are all now using `&raw const` or `&raw mut` in the
standard library.
Refactor the code in the `convert_while_ascii` helper function to make
it more suitable for auto-vectorization and also process the full ascii
prefix of the string. The generic case conversion logic will only be
invoked starting from the first non-ascii character.
The runtime on microbenchmarks with ascii-only inputs improves between
1.5x for short and 4x for long inputs on x86_64 and aarch64.
The new implementation also encapsulates all unsafe inside the
`convert_while_ascii` function.
Fixes#123712
Add str.as_str() for easy Deref to string slices
Working with `Box<str>` is cumbersome, because in places like `iter.filter()` it can end up being `&Box<str>` or even `&&Box<str>`, and such type doesn't always get auto-dereferenced as expected.
Dereferencing such box to `&str` requires ugly syntax like `&**boxed_str` or `&***boxed_str`, with the exact amount of `*`s.
`Box<str>` is [not easily comparable with other string types](https://github.com/rust-lang/rust/pull/129852) via `PartialEq`. `Box<str>` won't work for lookups in types like `HashSet<String>`, because `Borrow<String>` won't take types like `&Box<str>`. OTOH `set.contains(s.as_str())` works nicely regardless of levels of indirection.
`String` has a simple solution for this: the `as_str()` method, and `Box<str>` should too.
Avoid re-validating UTF-8 in `FromUtf8Error::into_utf8_lossy`
Part of the unstable feature `string_from_utf8_lossy_owned` - #129436
Refactor `FromUtf8Error::into_utf8_lossy` to copy valid UTF-8 bytes into the buffer, avoiding double validation of bytes.
Add tests that mirror the `String::from_utf8_lossy` tests.
Refactor `into_utf8_lossy` to copy valid UTF-8 bytes into the buffer,
avoiding double validation of bytes.
Add tests that mirror the `String::from_utf8_lossy` tests
[Clippy] Get rid of most `std` `match_def_path` usage, swap to diagnostic items.
Part of https://github.com/rust-lang/rust-clippy/issues/5393.
This was going to remove all `std` paths, but `SeekFrom` has issues being cleanly replaced with a diagnostic item as the paths are for variants, which currently cannot be diagnostic items.
This also, as a last step, categories the paths to help with future path removals.
Add new_cyclic_in for Rc and Arc
Currently, new_cyclic_in does not exist for Rc and Arc. This is an oversight according to https://github.com/rust-lang/wg-allocators/issues/132.
This PR adds new_cyclic_in for Rc and Arc. The implementation is almost the exact same as new_cyclic with some small differences to make it allocator-specific. new_cyclic's implementation has been replaced with a call to `new_cyclic_in(data_fn, Global)`.
Remaining questions:
* ~~Is requiring Allocator to be Clone OK? According to https://github.com/rust-lang/wg-allocators/issues/88, Allocators should be cheap to clone. I'm just hesitant to add unnecessary constraints, though I don't see an obvious workaround for this function since many called functions in new_cyclic_in expect an owned Allocator. I see Allocator.by_ref() as an option, but that doesn't work on when creating Weak { ptr: init_ptr, alloc: alloc.clone() }, because the type of Weak then becomes Weak<T, &A> which is incompatible.~~ Fixed, thank you `@zakarumych!` This PR no longer requires the allocator to be Clone.
* Currently, new_cyclic_in's documentation is almost entirely copy-pasted from new_cyclic, with minor tweaks to make it more accurate (e.g. Rc<T> -> Rc<T, A>). The example section is removed to mitigate redundancy and instead redirects to cyclic_in. Is this appropriate?
* ~~The comments in new_cyclic_in (and much of the implementation) are also copy-pasted from new_cyclic. Would it be better to make a helper method new_cyclic_in_internal that both functions call, with either Global or the custom allocator? I'm not sure if that's even possible, since the internal method would have to return Arc<T, Global> and I don't know if it's possible to "downcast" that to an Arc<T>. Maybe transmute would work here?~~ Done, thanks `@zakarumych`
* Arc::new_cyclic is #[inline], but Rc::new_cyclic is not. Which is preferred?
* nit: does it matter where in the impl block new_cyclic_in is defined?
Implement feature `string_from_utf8_lossy_owned` for lossy conversion from `Vec<u8>` to `String` methods
Accepted ACP: https://github.com/rust-lang/libs-team/issues/116
Tracking issue: #129436
Implement feature for lossily converting from `Vec<u8>` to `String`
- Add `String::from_utf8_lossy_owned`
- Add `FromUtf8Error::into_utf8_lossy`
---
Related to #64727, but unsure whether to mark it "fixed" by this PR.
That issue partly asks for in-place replacement of the original allocation. We fulfill the other half of that request with these functions.
closes#64727
Stabilize `&mut` (and `*mut`) as well as `&Cell` (and `*const Cell`) in const
This stabilizes `const_mut_refs` and `const_refs_to_cell`. That allows a bunch of new things in const contexts:
- Mentioning `&mut` types
- Creating `&mut` and `*mut` values
- Creating `&T` and `*const T` values where `T` contains interior mutability
- Dereferencing `&mut` and `*mut` values (both for reads and writes)
The same rules as at runtime apply: mutating immutable data is UB. This includes mutation through pointers derived from shared references; the following is diagnosed with a hard error:
```rust
#[allow(invalid_reference_casting)]
const _: () = {
let mut val = 15;
let ptr = &val as *const i32 as *mut i32;
unsafe { *ptr = 16; }
};
```
The main limitation that is enforced is that the final value of a const (or non-`mut` static) may not contain `&mut` values nor interior mutable `&` values. This is necessary because the memory those references point to becomes *read-only* when the constant is done computing, so (interior) mutable references to such memory would be pretty dangerous. We take a multi-layered approach here to ensuring no mutable references escape the initializer expression:
- A static analysis rejects (interior) mutable references when the referee looks like it may outlive the current MIR body.
- To be extra sure, this static check is complemented by a "safety net" of dynamic checks. ("Dynamic" in the sense of "running during/after const-evaluation, e.g. at runtime of this code" -- in contrast to "static" which works entirely by looking at the MIR without evaluating it.)
- After the final value is computed, we do a type-driven traversal of the entire value, and if we find any `&mut` or interior-mutable `&` we error out.
- However, the type-driven traversal cannot traverse `union` or raw pointers, so there is a second dynamic check where if the final value of the const contains any pointer that was not derived from a shared reference, we complain. This is currently a future-compat lint, but will become an ICE in #128543. On the off-chance that it's actually possible to trigger this lint on stable, I'd prefer if we could make it an ICE before stabilizing const_mut_refs, but it's not a hard blocker. This part of the "safety net" is only active for mutable references since with shared references, it has false positives.
Altogether this should prevent people from leaking (interior) mutable references out of the const initializer.
While updating the tests I learned that surprisingly, this code gets rejected:
```rust
const _: Vec<i32> = {
let mut x = Vec::<i32>::new(); //~ ERROR destructor of `Vec<i32>` cannot be evaluated at compile-time
let r = &mut x;
let y = x;
y
};
```
The analysis that rejects destructors in `const` is very conservative when it sees an `&mut` being created to `x`, and then considers `x` to be always live. See [here](https://github.com/rust-lang/rust/issues/65394#issuecomment-541499219) for a longer explanation. `const_precise_live_drops` will solve this, so I consider this problem to be tracked by https://github.com/rust-lang/rust/issues/73255.
Cc `@rust-lang/wg-const-eval` `@rust-lang/lang`
Cc https://github.com/rust-lang/rust/issues/57349
Cc https://github.com/rust-lang/rust/issues/80384
Add `NonNull` convenience methods to `Box` and `Vec`
Implements the ACP: https://github.com/rust-lang/libs-team/issues/418.
The docs for the added methods are mostly copied from the existing methods that use raw pointers instead of `NonNull`.
I'm new to this "contributing to rustc" thing, so I'm sorry if I did something wrong. In particular, I don't know what the process is for creating a new unstable feature. Please advise me if I should do something. Thank you.