Rollup of 8 pull requests
Successful merges:
- #106904 (Preserve split DWARF files when building archives.)
- #106971 (Handle diagnostics customization on the fluent side (for one specific diagnostic))
- #106978 (Migrate mir_build's borrow conflicts)
- #107150 (`ty::tls` cleanups)
- #107168 (Use a type-alias-impl-trait in `ObligationForest`)
- #107189 (Encode info for Adt in a single place.)
- #107322 (Custom mir: Add support for some remaining, easy to support constructs)
- #107323 (Disable ConstGoto opt in cleanup blocks)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Custom mir: Add support for some remaining, easy to support constructs
Some documentation for previous changes and support for `Deinit`, checked binops, len, and array repetition
r? ```@oli-obk``` or ```@tmiasko```
Move format_args!() into AST (and expand it during AST lowering)
Implements https://github.com/rust-lang/compiler-team/issues/541
This moves FormatArgs from rustc_builtin_macros to rustc_ast_lowering. For now, the end result is the same. But this allows for future changes to do smarter things with format_args!(). It also allows Clippy to directly access the ast::FormatArgs, making things a lot easier.
This change turns the format args types into lang items. The builtin macro used to refer to them by their path. After this change, the path is no longer relevant, making it easier to make changes in `core`.
This updates clippy to use the new language items, but this doesn't yet make clippy use the ast::FormatArgs structure that's now available. That should be done after this is merged.
Avoid __cxa_thread_atexit_impl on Emscripten
- Fixes https://github.com/rust-lang/rust/issues/91628.
- Fixes https://github.com/emscripten-core/emscripten/issues/15722.
See discussion in both issues.
The TL;DR is that weak linkage causes LLVM to produce broken Wasm, presumably due to pointer mismatch. The code is casting a void pointer to a function pointer with specific signature, but Wasm is very strict about function pointer compatibility, so the resulting code is invalid.
Ideally LLVM should catch this earlier in the process rather than emit invalid Wasm, but it currently doesn't and this is an easy and valid fix, given that Emcripten doesn't have `__cxa_thread_atexit_impl` these days anyway.
Unfortunately, I can't add a regression test as even after looking into this issue for a long time, I couldn't reproduce it with any minimal Rust example, only with extracted LLVM IR or on a large project involving Rust + C++.
impl DispatchFromDyn for Cell and UnsafeCell
After some fruitful discussion on [Internals](https://internals.rust-lang.org/t/impl-dispatchfromdyn-for-cell-2/16520) here's my first PR to rust-lang/rust 🎉
Please let me know if there's something I missed.
This adds `DispatchFromDyn` impls for `Cell`, `UnsafeCell` and `SyncUnsafeCell`.
An existing test is also expanded to test the `Cell` impl (which requires the `UnsafeCell` impl)
The different `RefCell` types can not implement `DispatchFromDyn` since they have more than one (non ZST) field.
**Edit:**
### What:
These changes allow one to make types like `MyRc`(code below), to be object safe method receivers after implementing `DispatchFromDyn` and `Deref` for them.
This allows for code like this:
```rust
struct MyRc<T: ?Sized>(Cell<NonNull<RcBox<T>>>);
/* impls for DispatchFromDyn, CoerceUnsized and Deref for MyRc*/
trait Trait {
fn foo(self: MyRc<Self>);
}
let impls_trait = ...;
let rc = MyRc::new(impls_trait) as MyRc<dyn Trait>;
rc.foo();
```
Note: `Cell` and `UnsafeCell` won't directly become valid method receivers since they don't implement `Deref`. Making use of these changes requires a wrapper type and nightly features.
### Why:
A custom pointer type with interior mutability allows one to store extra information in the pointer itself.
These changes allow for such a type to be a method receiver.
### Examples:
My use case is a cycle aware custom `Rc` implementation that when dropping a cycle marks some references dangling.
On the [forum](https://internals.rust-lang.org/t/impl-dispatchfromdyn-for-cell/14762/8) andersk mentioned that they track if a `Gc` reference is rooted with an extra bit in the reference itself.
Suggest using a lock for `*Cell: Sync` bounds
I mostly did this for `OnceCell<T>` at first because users will be confused to see that the `OnceCell<T>` in `std` isn't `Sync` but then extended it to `Cell<T>` and `RefCell<T>` as well.
`sub_ptr()` is equivalent to `usize::try_from().unwrap_unchecked()`, not `usize::from().unwrap_unchecked()`
`usize::from()` gives a `usize`, not `Result<usize>`, and `usize: From<isize>` is not implemented.
Allow fmt::Arguments::as_str() to return more Some(_).
This adjusts the documentation to allow optimization of format_args!() to be visible through fmt::Arguments::as_str().
This allows for future changes like https://github.com/rust-lang/rust/pull/106824.
Allow setting CFG_DISABLE_UNSTABLE_FEATURES to 0
Two locations check whether this build-time environment variable is defined. Allowing it to be explicitly disabled with a "0" value is useful, especially for integrating with external build systems.
Before this change, the following functions and macros were annotated
with `#[cfg(target_has_atomic = "8")]` or
`#[cfg(target_has_atomic_load_store = "8")]`:
* `atomic_int`
* `strongest_failure_ordering`
* `atomic_swap`
* `atomic_add`
* `atomic_sub`
* `atomic_compare_exchange`
* `atomic_compare_exchange_weak`
* `atomic_and`
* `atomic_nand`
* `atomic_or`
* `atomic_xor`
* `atomic_max`
* `atomic_min`
* `atomic_umax`
* `atomic_umin`
However, none of those functions and macros actually depend on 8-bit
width and they are needed for all atomic widths (16-bit, 32-bit, 64-bit
etc.). Some targets might not support 8-bit atomics (i.e. BPF, if we
would enable atomic CAS for it).
This change fixes that by removing the `"8"` argument from annotations,
which results in accepting the whole variety of widths.
Fixes#106845Fixes#106795
Signed-off-by: Michal Rostecki <vadorovsky@gmail.com>
Add `Arc::into_inner` for safely discarding `Arc`s without calling the destructor on the inner type.
ACP: rust-lang/libs-team#162
Reviving #79665.
I want to get this merged this time; this does not contain changes (apart from very minor changes in comments/docs).
See #79665 for further description of the PR. The only “unresolved” points that led to that PR being closed, AFAICT, were
* The desire to also implement a `Rc::into_inner` function
* however, this can very well also happen as a subsequent PR
* Possible need for further discussion on the naming “`into_inner`” (?)
* `into_inner` seems fine to me; also, this PR introduces unstable API, and names can be changed later, too
* ~~I don't know if a tracking issue for the feature flag is supposed to be opened before or after this PR gets merged (if *before*, then I can add the issue number to the `#[unstable…]` attribute)~~ There is a [tracking issue](https://github.com/rust-lang/rust/issues/106894) now.
I say “unresolved” in quotation marks because from my point of view, if reviewers agree, the PR can be merged immediately and as-is :-)
Do not use box syntax in `std`
See #94970 and #49733. About half of the `box` instances in `std` do not even need to allocate, the other half can simply be replaced with `Box::new`.
`@rustbot` label +T-libs
r? rust-lang/libs
Add note about absolute paths to Path::join
The note already exists on `PathBuf::push`, but I think it is good to have it on `Path::join` as well since it can cause issues if you are not careful with your input.
Improve the documentation of `black_box`
There don't seem to be many great resources on how `black_box` should be used, so I added some information here
Unify stable and unstable sort implementations in same core module
This moves the stable sort implementation to the core::slice::sort module. By virtue of being in core it can't access `Vec`. The two `Vec` used by merge sort, `buf` and `runs`, are modelled as custom types that implement the very limited required `Vec` interface with the help of provided allocation and free functions. This is done to allow future re-use of functions and logic between stable and unstable sort. Such as `insert_head`.
This is in preparation of #100856 and #104116. It only moves code, it *doesn't* change any of the sort related logic. This unlocks the ability to share `insert_head`, `insert_tail`, `swap_if_less` `merge` and more.
Tagging ````@Mark-Simulacrum```` I hope this allows progress on #100856, by moving `merge_sort` here I hope future changes will be easier to review.
Rollup of 5 pull requests
Successful merges:
- #105977 (Transform async `ResumeTy` in generator transform)
- #106927 (make `CastError::NeedsDeref` create a `MachineApplicable` suggestion)
- #106931 (document + UI test `E0208` and make its output more user-friendly)
- #107027 (Remove extra removal from test path)
- #107037 (Fix Dominators::rank_partial_cmp to match documentation)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Implement `alloc::vec::IsZero` for `Option<$NUM>` types
Fixes#106911
Mirrors the `NonZero$NUM` implementations with an additional `assert_zero_valid`.
`None::<i32>` doesn't stricly satisfy `IsZero` but for the purpose of allocating we can produce more efficient codegen.
- Eliminates all the `get_context` calls that async lowering created.
- Replace all `Local` `ResumeTy` types with `&mut Context<'_>`.
The `Local`s that have their types replaced are:
- The `resume` argument itself.
- The argument to `get_context`.
- The yielded value of a `yield`.
The `ResumeTy` hides a `&mut Context<'_>` behind an unsafe raw pointer, and the
`get_context` function is being used to convert that back to a `&mut Context<'_>`.
Ideally the async lowering would not use the `ResumeTy`/`get_context` indirection,
but rather directly use `&mut Context<'_>`, however that would currently
lead to higher-kinded lifetime errors.
See <https://github.com/rust-lang/rust/issues/105501>.
The async lowering step and the type / lifetime inference / checking are
still using the `ResumeTy` indirection for the time being, and that indirection
is removed here. After this transform, the generator body only knows about `&mut Context<'_>`.
Don't do pointer arithmetic on pointers to deallocated memory
vec::Splice can invalidate the slice::Iter inside vec::Drain. So we replace them with dangling pointers which, unlike ones to deallocated memory, are allowed.
Fixes miri test failures.
Fixes https://github.com/rust-lang/miri/issues/2759
Lift `T: Sized` bounds from some `strict_provenance` pointer methods
This PR removes requirement for `T` (pointee type) to be `Sized` to call `pointer::{addr, expose_addr, with_addr, map_addr}`. These functions don't use `T`'s size, so there is no reason for them to require this. Updated public API:
cc ``@Gankra,`` #95228
r? libs-api
Add heapsort fallback in `select_nth_unstable`
Addresses #102451 and #106933.
`slice::select_nth_unstable` uses a quick select implementation based on the same pattern defeating quicksort algorithm that `slice::sort_unstable` uses. `slice::sort_unstable` uses a recursion limit and falls back to heapsort if there were too many bad pivot choices, to ensure O(n log n) worst case running time (known as introsort). However, `slice::select_nth_unstable` does not have such a fallback strategy, which leads to it having a worst case running time of O(n²) instead. #102451 links to a playground which generates pathological inputs that show this quadratic behavior. On my machine, a randomly generated slice of length `1 << 19` takes ~200µs to calculate its median, whereas a pathological input of the same length takes over 2.5s. This PR adds an iteration limit to `select_nth_unstable`, falling back to heapsort, which ensures an O(n log n) worst case running time (introselect). With this change, there was no noticable slowdown for the random input, but the same pathological input now takes only ~1.2ms. In the future it might be worth implementing something like Median of Medians or Fast Deterministic Selection instead, which guarantee O(n) running time for all possible inputs. I've left this as a `FIXME` for now and only implemented the heapsort fallback to minimize the needed code changes.
I still think we should clarify in the `select_nth_unstable` docs that the worst case running time isn't currently O(n) (the original reason that #102451 was opened), but I think it's a lot better to be able to guarantee O(n log n) instead of O(n²) for the worst case.
vec::Splice can invalidate the slice::Iter inside vec::Drain.
So we replace them with dangling pointers which, unlike ones to
deallocated memory, are allowed.
Simplify manual ptr arithmetic in slice::Iter with ptr_sub
The old code was introduced in #61885, which predates the ptr_sub method and underlying intrinsic. The codegen test still passes.
r? `@scottmcm`
Leak amplification for peek_mut() to ensure BinaryHeap's invariant is always met
In the libs-api team's discussion around #104210, some of the team had hesitations around exposing malformed BinaryHeaps of an element type whose Ord and Drop impls are trusted, and which does not contain interior mutability.
For example in the context of this kind of code:
```rust
use std::collections::BinaryHeap;
use std::ops::Range;
use std::slice;
fn main() {
let slice = &mut ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
let cut_points = BinaryHeap::from(vec![4, 2, 7]);
println!("{:?}", chop(slice, cut_points));
}
// This is a souped up slice::split_at_mut to split in arbitrary many places.
//
// usize's Ord impl is trusted, so 1 single bounds check guarantees all those
// output slices are non-overlapping and in-bounds
fn chop<T>(slice: &mut [T], mut cut_points: BinaryHeap<usize>) -> Vec<&mut [T]> {
let mut vec = Vec::with_capacity(cut_points.len() + 1);
let max = match cut_points.pop() {
Some(max) => max,
None => {
vec.push(slice);
return vec;
}
};
assert!(max <= slice.len());
let len = slice.len();
let ptr: *mut T = slice.as_mut_ptr();
let get_unchecked_mut = unsafe {
|range: Range<usize>| &mut *slice::from_raw_parts_mut(ptr.add(range.start), range.len())
};
vec.push(get_unchecked_mut(max..len));
let mut end = max;
while let Some(start) = cut_points.pop() {
vec.push(get_unchecked_mut(start..end));
end = start;
}
vec.push(get_unchecked_mut(0..end));
vec
}
```
```console
[['7', '8', '9'], ['4', '5', '6'], ['2', '3'], ['0', '1']]
```
In the current BinaryHeap API, `peek_mut()` is the only thing that makes the above function unsound.
```rust
let slice = &mut ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
let mut cut_points = BinaryHeap::from(vec![4, 2, 7]);
{
let mut max = cut_points.peek_mut().unwrap();
*max = 0;
std::mem::forget(max);
}
println!("{:?}", chop(slice, cut_points));
```
```console
[['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], [], ['2', '3'], ['0', '1']]
```
Or worse:
```rust
let slice = &mut ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
let mut cut_points = BinaryHeap::from(vec![100, 100]);
{
let mut max = cut_points.peek_mut().unwrap();
*max = 0;
std::mem::forget(max);
}
println!("{:?}", chop(slice, cut_points));
```
```console
[['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], [], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '\u{1}', '\0', '?', '翾', '?', '翾', '\0', '\0', '?', '翾', '?', '翾', '?', '啿', '?', '啿', '?', '啿', '?', '啿', '?', '啿', '?', '翾', '\0', '\0', '', '啿', '\u{5}', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\u{8}', '\0', '`@',` '\0', '\u{1}', '\0', '?', '翾', '?', '翾', '?', '翾', '
thread 'main' panicked at 'index out of bounds: the len is 33 but the index is 33', library/core/src/unicode/unicode_data.rs:319:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
---
This PR makes `peek_mut()` use leak amplification (https://doc.rust-lang.org/1.66.0/nomicon/leaking.html#drain) to preserve the heap's invariant even in the situation that `PeekMut` gets leaked.
I'll also follow up in the tracking issue of unstable `drain_sorted()` (#59278) and `retain()` (#71503).
Fix the stability attributes for `std::os::fd`.
As `@bjorn3` pointed out [here], I used the wrong stability attribute in #98368 when making `std::os::fd` public. I set it to Rust 1.63, which was when io-safety was stabilized, but it should be Rust 1.66, which was when `std::os::fd` was stabilized.
[here]: https://github.com/rust-lang/rust/pull/98368#discussion_r1063721420
As @bjorn3 pointed out [here], I used the wrong stability attribute in #98368
when making `std::os::fd` public. I set it to Rust 1.63, which was when
io-safety was stabilized, but it should be Rust 1.66, which was when
`std::os::fd` was stabilized.
[here]: https://github.com/rust-lang/rust/pull/98368#discussion_r1063721420
Remove various double spaces in the libraries.
I was just pretty bothered by this when reading the source for a function, and was suggested to check if this happened elsewhere.
Stop probing for statx unless necessary
As is the current toy program:
fn main() -> std::io::Result<()> {
use std::fs;
let metadata = fs::metadata("foo.txt")?;
assert!(!metadata.is_dir());
Ok(())
}
... observed under strace will issue:
[snip]
statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address) statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0
While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.
Note this is still suboptimal as there still will be programs issuing it, but bulk of the problem is removed.
Tested by forbidding the syscall for the binary and observing it correctly falls back to newfstatat.
While here tidy up the commentary, in particular by denoting some problems with the current approach.
mv binary_heap.rs binary_heap/mod.rs
I confess this request is somewhat selfish, as it's made in order to ease synchronisation with my [copse](https://crates.io/crates/copse) crate (see eggyal/copse#6 for explanation). I wholly understand that such grounds may be insufficient to justify merging this request—but no harm in asking, right?
reword Option::as_ref and Option::map examples
The description for the examples of `Option::as_ref` and `Option::map` imply that the example is only doing type conversion, when it is actually finding the length of a string.
Changes the wording to imply that some operation is being run on the value contained in the `Option`
closes#104476
[LSDA] Take ttype_index into account when taking unwind action
If `cs_action != 0`, we should check the `ttype_index` field in action record. If `ttype_index == 0`, a clean up action is taken; otherwise catch action is taken.
This can fix unwind failure on AIX which uses LLVM's libunwind by default. IIUC, rust's LSDA is borrowed from c++ and I'm assuming itanium-cxx-abi https://itanium-cxx-abi.github.io/cxx-abi/exceptions.pdf should be followed, so the fix follows what libcxxabi does. See ec48682ce9/libcxxabi/src/cxa_personality.cpp (L152) for use of `ttype_index`.
- Fixes https://github.com/rust-lang/rust/issues/91628.
- Fixes https://github.com/emscripten-core/emscripten/issues/15722.
See discussion in both issues.
The TL;DR is that weak linkage causes LLVM to produce broken Wasm, presumably due to pointer mismatch. The code is casting a void pointer to a function pointer with specific signature, but Wasm is very strict about function pointer compatibility, so the resulting code is invalid.
Ideally LLVM should catch this earlier in the process rather than emit invalid Wasm, but it currently doesn't and this is an easy and valid fix, given that Emcripten doesn't have `__cxa_thread_atexit_impl` these days anyway.
Unfortunately, I can't add a regression test as even after looking into this issue for a long time, I couldn't reproduce it with any minimal Rust example, only with extracted LLVM IR or on a large project involving Rust + C++.
r? @alexcrichton
Two locations check whether this build-time environment variable is
defined. Allowing it to be explicitly disabled with a "0" value is
useful, especially for integrating with external build systems.