Add Linux-specific pidfd process extensions (take 2)
Continuation of #77168.
I addressed the following concerns from the original PR:
- make `CommandExt` and `ChildExt` sealed traits
- wrap file descriptors in `PidFd` struct representing ownership over the fd
- add `take_pidfd` to take the fd out of `Child`
- close fd when dropped
Tracking Issue: #82971
Move UnwindSafe, RefUnwindSafe, AssertUnwindSafe to core
They were previously only available in std::panic, not core::panic.
- https://doc.rust-lang.org/1.51.0/std/panic/trait.UnwindSafe.html
- https://doc.rust-lang.org/1.51.0/std/panic/trait.RefUnwindSafe.html
- https://doc.rust-lang.org/1.51.0/std/panic/struct.AssertUnwindSafe.html
Where this is relevant: trait objects! Inside a `#![no_std]` library it's otherwise impossible to have a struct holding a trait object, and at the same time can be used from downstream std crates in a way that doesn't interfere with catch_unwind.
```rust
// common library
#![no_std]
pub struct Thing {
pub(crate) x: &'static (dyn SomeTrait + Send + Sync),
}
pub(crate) trait SomeTrait {...}
```
```rust
// downstream application
fn main() {
let thing: library::Thing = ...;
let _ = std::panic::catch_unwind(|| { let _ = thing; }); // does not work :(
}
```
See a4131708e2/src/gradient.rs (L7-L15) for a real life example of needing to work around this problem. In particular that workaround would not even be viable if implementors of the trait were provided externally by a caller, as the `feature = "std"` would become non-additive in that case.
What happens without the UnwindSafe constraints:
```rust
fn main() {
let gradient = colorous::VIRIDIS;
let _ = std::panic::catch_unwind(|| { let _ = gradient; });
}
```
```console
error[E0277]: the type `(dyn colorous::gradient::EvalGradient + Send + Sync + 'static)` may contain interior mutability and a reference may not be safely transferrable across a catch_unwind boundary
--> src/main.rs:3:13
|
3 | let _ = std::panic::catch_unwind(|| { let _ = gradient; });
| ^^^^^^^^^^^^^^^^^^^^^^^^ `(dyn colorous::gradient::EvalGradient + Send + Sync + 'static)` may contain interior mutability and a reference may not be safely transferrable across a catch_unwind boundary
|
::: .rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:430:40
|
430 | pub fn catch_unwind<F: FnOnce() -> R + UnwindSafe, R>(f: F) -> Result<R> {
| ---------- required by this bound in `catch_unwind`
|
= help: within `Gradient`, the trait `RefUnwindSafe` is not implemented for `(dyn colorous::gradient::EvalGradient + Send + Sync + 'static)`
= note: required because it appears within the type `&'static (dyn colorous::gradient::EvalGradient + Send + Sync + 'static)`
= note: required because it appears within the type `Gradient`
= note: required because of the requirements on the impl of `UnwindSafe` for `&Gradient`
= note: required because it appears within the type `[closure@src/main.rs:3:38: 3:62]`
```
Implement advance_by, advance_back_by for slice::{Iter, IterMut}
Part of #77404.
Picking up where #77633 was closed.
I have addressed https://github.com/rust-lang/rust/pull/77633#issuecomment-771842599 by restoring `nth` and `nth_back`. So according to that comment this should already be r=m-ou-se, but it has been sitting for a while.
Track caller of Vec::remove()
`vec.remove(invalid)` doesn't print a helpful source position:
> thread 'main' panicked at 'removal index (is 99) should be < len (is 1)', **library/alloc/src/vec/mod.rs:1379:13**
Add docs about performance and `Iterator::map` to `[T; N]::map`
This suboptimal code gen for some usages of array::map got a bit of
attention by multiple people throughout the community. Some cases:
- https://github.com/rust-lang/rust/issues/75243#issuecomment-866051086
- https://github.com/rust-lang/rust/issues/75243#issuecomment-874732134
- https://www.reddit.com/r/rust/comments/oeqqf7/unexpected_high_stack_usage/
My *guess* is that this gets the attention it gets because in JavaScript
(and potentially other languages), a `map` function on arrays is very
commonly used since in those languages, arrays basically take the role
of Rust's iterator. I considered explicitly naming JavaScript in the
first paragraph I added, but I couldn't find precedence of mentioning
other languages in standard library doc, so I didn't add it.
When array::map was stabilized, we still wanted to add docs, but that
somehow did not happen in time. So here we are. Not sure if this sounds
crazy but maybe it is worth considering beta backporting this? Only if
it's not a lot of work, of course! But yeah, stabilized array::map is
already in beta and if this problem is really as big as it sometimes seems,
might be worth having the docs in place when 1.55 is released.
CC ``@CryZe``
r? ``@m-ou-se`` (since you were involved in that discussion and the stabilization)
Make `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` warn by default
This PR makes the `SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` lint warn by default.
To avoid showing a large number of un-actionable warnings to users, we only enable the lint for macros defined in the same crate. This ensures that users will be able to fix the warning by simply removing a semicolon.
In the future, I'd like to enable this lint unconditionally, and eventually make it into a hard error in a future edition. This PR is a step towards that goal.
[backtraces]: look for the `begin` symbol only after seeing `end`
On `x86_64-pc-windows-msvc`, we often get backtraces which look like
this:
```
10: 0x7ff77e0e9be5 - std::panicking::rust_panic_with_hook
11: 0x7ff77e0e11b4 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5769736bdb11136c
12: 0x7ff77e0e116f - std::sys_common::backtrace::__rust_end_short_backtrace::h61c7ecb1b55338ae
13: 0x7ff77e0f89dd - std::panicking::begin_panic::h8e60ef9f82a41805
14: 0x7ff77e0e108c - d
15: 0x7ff77e0e1069 - c
16: 0x7ff77e0e1059 - b
17: 0x7ff77e0e1049 - a
18: 0x7ff77e0e1039 - core::ptr::drop_in_place<std::rt::lang_start<()>::{{closure}}>::h1bfcd14d5e15ba81
19: 0x7ff77e0e1186 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5769736bdb11136c
20: 0x7ff77e0e100c - std::rt::lang_start::{{closure}}::ha054184bbf9921e3
```
Notice that `__rust_begin_short_backtrace` appears on frame 11 before
`__rust_end_short_backtrace` on frame 12. This is because in typical
release binaries without debug symbols, dbghelp.dll, which we use to walk
and symbolize the stack, does not know where CGU internal functions
start or end and so the closure invoked by `__rust_end_short_backtrace`
is incorrectly described as `__rust_begin_short_backtrace` because it
happens to be near that symbol.
While that can obviously change, this has been happening quite
consistently since #75048. Since this is a very small change to the std
and the change makes sense by itself, I think this is worth doing.
This doesn't completely resolve the situation for release binaries on
Windows, since without debug symbols, the stack printed can still show
incorrect symbol names (this is why the test uses `#[no_mangle]`) but it
does slightly improve the situation in that you see the same backtrace
you would see with `RUST_BACKTRACE=full` or in a debugger (without the
uninteresting bits at the top and bottom).
Fixes part of #87481
Update the examples in `String` and `VecDeque::retain`
The examples added in #60396 used a "clever" post-increment hack,
unrelated to the actual point of the examples. That hack was found
[confusing] in the users forum, and #81811 already changed the `Vec`
example to use a more direct iterator. This commit changes `String` and
`VecDeque` in the same way for consistency.
[confusing]: https://users.rust-lang.org/t/help-understand-strange-expression/62858
Optimize fmt::PadAdapter::wrap
After adding the first `write!` usage to my project and printing the result to the console, I noticed, that my binary contains the strings "called `Option::unwrap()` on a `None` value`" and more importantly "C:\Users\Patrick Fischer\.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\fmt\builders.rs", with my release build being configured as follows:
```
[profile.release]
panic = "abort"
codegen-units = 1
strip = "symbols" # the important bit
lto = true
```
I am in a no_std environment and my custom panic handler is a simple `loop {}`. I did not expect the above information to be preserved. I heavily suspect the edited function to be the culprit. It contains the only direct use of `Option::unwrap` in the entire file and I tracked the symbols in the assembly to be used from the section `_ZN68_$LT$core..fmt..builders..PadAdapter$u20$as$u20$core..fmt..Write$GT$9write_str17ha1d5e5efe167202aE`.
Aside from me suspecting this function to be the culprit, the replaced code performs the same operation as `Option::insert`, but without the `unreachable_unchecked` optimization `Option::insert` provides. Therefore, it makes sense to me to use the more optimized version, instead.
As I don't change any semantics, I hope a simple pull request suffices.
Fix may not to appropriate might not or must not
I went through and changed occurrences of `may not` to be more explicit with `might not` and `must not`.
Since RFC 3052 soft deprecated the authors field anyway, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information, we should remove it from
crates in this repo.
On `x86_64-pc-windows-msvc`, we often get backtraces which look like
this:
```
10: 0x7ff77e0e9be5 - std::panicking::rust_panic_with_hook
11: 0x7ff77e0e11b4 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5769736bdb11136c
12: 0x7ff77e0e116f - std::sys_common::backtrace::__rust_end_short_backtrace::h61c7ecb1b55338ae
13: 0x7ff77e0f89dd - std::panicking::begin_panic::h8e60ef9f82a41805
14: 0x7ff77e0e108c - d
15: 0x7ff77e0e1069 - c
16: 0x7ff77e0e1059 - b
17: 0x7ff77e0e1049 - a
18: 0x7ff77e0e1039 - core::ptr::drop_in_place<std::rt::lang_start<()>::{{closure}}>::h1bfcd14d5e15ba81
19: 0x7ff77e0e1186 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5769736bdb11136c
20: 0x7ff77e0e100c - std::rt::lang_start::{{closure}}::ha054184bbf9921e3
```
Notice that `__rust_begin_short_backtrace` appears on frame 11 before
`__rust_end_short_backtrace` on frame 12. This is because in typical
release binaries without debug symbols, dbghelp.dll, which we use to walk
and symbolize the stack, does not know where CGU internal functions
start or end and so the closure invoked by `__rust_end_short_backtrace`
is incorrectly described as `__rust_begin_short_backtrace` because it
happens to be near that symbol.
While that can obviously change, this has been happening quite
consistently since #75048. Since this is a very small change to the std
and the change makes sense by itself, I think this is worth doing.
This doesn't completely resolve the situation for release binaries on
Windows, since without debug symbols, the stack printed can still show
incorrect symbol names (this is why the test uses `#[no_mangle]`) but it
does slightly improve the situation in that you see the same backtrace
you would see with `RUST_BACKTRACE=full` or in a debugger (without the
uninteresting bits at the top and bottom).
I looked in stdlib and as @BurntSushi thought, `raw` is generally
used for raw pointers, or other hazardous kinds of thing. stdlib does
not have `into_parts` apart from the one I added to `IntoInnerError`.
I did an ad-hoc search of the rustdocs for my current game project
Otter, which includes quite a large number of dependencies.
`into_parts` seems heavily used for things quite like this.
So change this name.
Suggested-by: Andrew Gallant <jamslam@gmail.com>
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
I didn't notice the submodule, which means I failed to re-export this
to make it actually-public.
Reported-by: Andrew Gallant <jamslam@gmail.com>
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Add #[track_caller] for some function in core::mem.
These functions can panic for some types. This makes the panic point to the code that calls e.g. mem::uninitialized(), instead of inside the definition of mem::uninitialized.
Make const panic!("..") work in Rust 2021.
During const eval, this replaces calls to core::panicking::panic_fmt and std::panicking::being_panic_fmt with a call to a new const fn: core::panicking::const_panic_fmt. That function uses fmt::Arguments::as_str() to get the str and calls panic_str with that instead.
panic!() invocations with formatting arguments are still not accepted, as the creation of such a fmt::Arguments cannot be done in constant functions right now.
r? `@RalfJung`
Remove unsound TrustedRandomAccess implementations
Removes the implementations that depend on the user-definable trait `Copy`.
Fixes#85873 in the most straightforward way.
<hr>
_Edit:_ This PR now contains additional trait infrastructure to avoid performance regressions around in-place collect, see the discussion in this thread starting from the codegen test failure at https://github.com/rust-lang/rust/pull/85874#issuecomment-872327577.
With this PR, `TrustedRandomAccess` gains additional documentation that specifically allows for and specifies the safety conditions around subtype coercions – those coercions can happen in safe Rust code with the `Zip` API’s usage of `TrustedRandomAccess`. This PR introduces a new supertrait of `TrustedRandomAccess`(currently named `TrustedRandomAccessNoCoerce`) that _doesn’t allow_ such coercions, which means it can be still be useful for optimizing cases such as in-place collect where no iterator is handed out to a user (who could do coercions) after a `get_unchecked` call; the benefit of the supertrait is that it doesn’t come with the additional safety conditions around supertraits either, so it can be implemented for more types than `TrustedRandomAccess`.
The `TrustedRandomAccess` implementations for `vec::IntoIter`, `vec_deque::IntoIter`, and `array::IntoIter` are removed as they don’t conform with the newly documented safety conditions, this way unsoundness is removed. But this PR in turn (re-)adds a `TrustedRandomAccessNoCoerce` implementation for `vec::IntoIter` to avoid performance regressions from stable in a case of in-place collecting of `Vec`s [the above-mentioned codegen test failure]. Re-introducing the (currently nightly+beta-only) impls for `VecDeque`’s and `[T; N]`’s iterators is technically possible, but goes beyond the scope of this PR (i.e. it can happen in a future PR).
The examples added in #60396 used a "clever" post-increment hack,
unrelated to the actual point of the examples. That hack was found
[confusing] in the users forum, and #81811 already changed the `Vec`
example to use a more direct iterator. This commit changes `String` and
`VecDeque` in the same way for consistency.
[confusing]: https://users.rust-lang.org/t/help-understand-strange-expression/62858
Remove P: Unpin bound on impl Future for Pin
We can safely produce a `Pin<&mut P::Target>` without moving out of the `Pin` by using `Pin::as_mut` directly.
The `Unpin` bound was originally added in #56939 following the recommendation of ``@withoutboats`` in https://github.com/rust-lang/rust/issues/55766#issue-378417538
That comment does not give explicit justification for why the bound should be added. The relevant context was:
> [ ] Remove `impl<P> Unpin for Pin<P>`
>
> This impl is not justified by our standard justification for unpin impls: there is no pointer direction between `Pin<P>` and `P`. Its usefulness is covered by the impls for pointers themselves.
>
> This futures impl (link to the impl changed in this PR) will need to change to add a `P: Unpin` bound.
The decision to remove the unconditional impl of `Unpin for Pin` is sound (these days there is just an auto-impl for when `P: Unpin`). But, I think the decision to also add the `Unpin` bound for `impl Future` may have been unnecessary. Or if that's not the case, I'd be very interested to have the argument for why written down somewhere. The bound _appears_ to not be needed, as demonstrated by the change requiring no unsafe code and by the existence of `Pin::as_mut`.
Stabilize core::task::ready!
_Tracking issue: https://github.com/rust-lang/rust/issues/70922_
This PR stabilizes the `task::ready!` macro. Similar to https://github.com/rust-lang/rust/pull/80886, this PR was waiting on https://github.com/rust-lang/rust/issues/74355 to be fixed.
The `task::ready!` API has existed in the futures ecosystem for several years, and was added on nightly last year in https://github.com/rust-lang/rust/pull/70817. The motivation for this macro is the same as it was back then: virtually every single manual future implementation makes use of this; so much so that it's one of the few things included in the [futures-core](https://docs.rs/futures-core/0.3.12/futures_core) library.
r? ``@tmandry``
cc/ ``@rust-lang/wg-async-foundations`` ``@rust-lang/libs``
## Example
```rust
use core::task::{Context, Poll};
use core::future::Future;
use core::pin::Pin;
async fn get_num() -> usize {
42
}
pub fn do_poll(cx: &mut Context<'_>) -> Poll<()> {
let mut f = get_num();
let f = unsafe { Pin::new_unchecked(&mut f) };
let num = ready!(f.poll(cx));
// ... use num
Poll::Ready(())
}
```
During const eval, this replaces calls to core::panicking::panic_fmt and
std::panicking::being_panic_fmt with a call to a new const fn:
core::panicking::const_panic_fmt. That function uses
fmt::Arguments::as_str() to get the str and calls panic_str with that
instead.
panic!() invocations with formatting arguments are still not accepted,
as the creation of such a fmt::Arguments cannot be done in constant
functions right now.
These functions can panic for some types. This makes the panic point to
the code that calls e.g. mem::uninitialized(), instead of inside the
definition of mem::uninitialized.
Include new details regarding coercions to a subtype.
These conditions also explain why the previously removed implementations
for {array, vec, vec_deque}::IntoIter<T> were unsound, because they introduced
an extra `T: Clone` for the TrustedRandomAccess impl, even though their parameter T
is covariant.
Document math behind MIN/MAX consts on integers
Currently the documentation for `[integer]::{MIN, MAX}` doesn't explain where the constants come from. This documents how the values of those constants are related to powers of 2.
Use hashbrown's `extend_reserve()` in `HashMap`
When we added `extend_reserve()` to our implementation of `Extend` for `HashMap`, hashbrown didn't have a version we could use. Now that hashbrown has added it, we should use its version instead of implementing it ourself.
Implement RFC 3107: `#[derive(Default)]` on enums with a `#[default]` attribute
This PR implements RFC 3107, which permits `#[derive(Default)]` on enums where a unit variant has a `#[default]` attribute. See comments for current status.
Update VxWork's UNIX support
1. VxWorks does not provide glibc
2. VxWorks does provide `sigemptyset` and `sigaddset`
Note: these changes are concurrent to [this PR](https://github.com/rust-lang/libc/pull/2295) in libc.
implement fold() on array::IntoIter to improve flatten().collect() perf
With #87168 flattening `array::IntoIter`s is now `TrustedLen`, the `FromIterator` implementation for `Vec` has a specialization for `TrustedLen` iterators which uses internal iteration. This implements one of the main internal iteration methods on `array::Into` to optimize the combination of those two features.
This should address the main issue in #87411
```
# old
test vec::bench_flat_map_collect ... bench: 2,244,024 ns/iter (+/- 18,903)
# new
test vec::bench_flat_map_collect ... bench: 172,863 ns/iter (+/- 2,141)
```
Make StrSearcher behave correctly on empty needle
Fix#85462.
This will not affect ABI since the other variant of the enum is bigger.
It may break some code, but that would be very strange: usually people
don't continue after the first `Done` (or `None` for a normal iterator).
`@rustbot` label T-libs A-str A-patterns
Add support for custom allocator in `VecDeque`
This follows the [roadmap](https://github.com/rust-lang/wg-allocators/issues/7) of the allocator WG to add custom allocators to collections.
`@rustbot` modify labels: +A-allocators +T-libs
Stabilize `impl From<[(K, V); N]> for HashMap` (and friends)
In addition to allowing HashMap to participate in Into/From conversion, this adds the long-requested ability to use constructor-like syntax for initializing a HashMap:
```rust
let map = HashMap::from([
(1, 2),
(3, 4),
(5, 6)
]);
```
This addition is highly motivated by existing precedence, e.g. it is already possible to similarly construct a Vec from a fixed-size array:
```rust
let vec = Vec::from([1, 2, 3]);
```
...and it is already possible to collect a Vec of tuples into a HashMap (and vice-versa):
```rust
let vec = Vec::from([(1, 2)]);
let map: HashMap<_, _> = vec.into_iter().collect();
let vec: Vec<(_, _)> = map.into_iter().collect();
```
...and of course it is likewise possible to collect a fixed-size array of tuples into a HashMap ([but not vice-versa just yet](https://github.com/rust-lang/rust/issues/81615)):
```rust
let arr = [(1, 2)];
let map: HashMap<_, _> = std::array::IntoIter::new(arr).collect();
```
Therefore this addition seems like a no-brainer.
As for any impl, this would be insta-stable.
DOC: remove unnecessary feature crate attribute from example code
I'm not sure whether I fully understand the stabilization process (I most likely don't), but I think this attribute isn't necessary here, right?
This was recently stabilized in #86344.
Stabilize `into_parts()` and `into_error()`
This stabilizes `IntoInnerError`'s `into_parts()` and `into_error()` methods, currently gated behind the `io_into_inner_error_parts` feature. The FCP has [already completed.](https://github.com/rust-lang/rust/issues/79704#issuecomment-880652967)
Closes#79704.
Document iteration order of `retain` functions
For `HashSet` and `HashMap`, this simply copies the comment from
`BinaryHeap::retain`.
For `BTreeSet` and `BTreeMap`, this adds an additional guarantee that
wasn't previously documented. I think that because these data structures
are inherently ordered and other functions guarantee ordered iteration,
it makes sense to provide this guarantee for `retain` as well.
DOC: fix hypothetical Rust code in `step_by()` docstring
I don't know how important that is, but if I'm not mistaken, the hypothetical code in the docstring of `step_by()` (see https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.step_by) isn't correct.
I guess writing `next()` instead of `self.next()` isn't a biggie, but this would also imply that `advance_n_and_return_first()` is a method, which AFAICT it isn't.
I've also done some re-formatting in a separate commit and a parameter renaming in yet another commit.
Feel free to take or leave any combination of those commits.
Regression fix to avoid further beta backports: Remove unsound TrustedRandomAccess implementations
Removes the implementations that depend on the user-definable trait `Copy`.
Only fix regressions to ensure merge in 1.55: Does not modify `vec::IntoIter`.
<hr>
This PR applies the beta-`1.53` backport #86222 (merged as part of #86225), a reduced version of #85874 that only fixes regressions, to `master` in order to avoid the need for further backports from `1.55` onwards. Beta-`1.54` backport already happened with #87136. In case that #85874 gets merged quickly (within a week), this PR would be unnecessary.
r? `@cuviper`
docs: GlobalAlloc: completely replace example with one that works
Since this is an example, this could really do with some review from someone familiar with unsafe stuff!
I made the example no longer `no_run` since it works for me.
Fixes#81847
Add comments explaining the unix command-line argument support.
Following up on #87236, add comments to the unix command-line argument
support explaining that the code doesn't mutate the system-provided
argc/argv, and that this is why the code doesn't need a lock or special
memory ordering.
r? ```@RalfJung```
Removes the implementations that depend on the user-definable trait `Copy`.
Only fix regressions to ensure merge in 1.55: Does not modify `vec::IntoIter`.
Background:
Over the last year, pidfd support was added to the Linux kernel. This
allows interacting with other processes. In particular, this allows
waiting on a child process with a timeout in a race-free way, bypassing
all of the awful signal-handler tricks that are usually required.
Pidfds can be obtained for a child process (as well as any other
process) via the `pidfd_open` syscall. Unfortunately, this requires
several conditions to hold in order to be race-free (i.e. the pid is not
reused).
Per `man pidfd_open`:
```
· the disposition of SIGCHLD has not been explicitly set to SIG_IGN
(see sigaction(2));
· the SA_NOCLDWAIT flag was not specified while establishing a han‐
dler for SIGCHLD or while setting the disposition of that signal to
SIG_DFL (see sigaction(2)); and
· the zombie process was not reaped elsewhere in the program (e.g.,
either by an asynchronously executed signal handler or by wait(2)
or similar in another thread).
If any of these conditions does not hold, then the child process
(along with a PID file descriptor that refers to it) should instead
be created using clone(2) with the CLONE_PIDFD flag.
```
Sadly, these conditions are impossible to guarantee once any libraries
are used. For example, C code runnng in a different thread could call
`wait()`, which is impossible to detect from Rust code trying to open a
pidfd.
While pid reuse issues should (hopefully) be rare in practice, we can do
better. By passing the `CLONE_PIDFD` flag to `clone()` or `clone3()`, we
can obtain a pidfd for the child process in a guaranteed race-free
manner.
This PR:
This PR adds Linux-specific process extension methods to allow obtaining
pidfds for processes spawned via the standard `Command` API. Other than
being made available to user code, the standard library does not make
use of these pidfds in any way. In particular, the implementation of
`Child::wait` is completely unchanged.
Two Linux-specific helper methods are added: `CommandExt::create_pidfd`
and `ChildExt::pidfd`. These methods are intended to serve as a building
block for libraries to build higher-level abstractions - in particular,
waiting on a process with a timeout.
I've included a basic test, which verifies that pidfds are created iff
the `create_pidfd` method is used. This test is somewhat special - it
should always succeed on systems with the `clone3` system call
available, and always fail on systems without `clone3` available. I'm
not sure how to best ensure this programatically.
This PR relies on the newer `clone3` system call to pass the `CLONE_FD`,
rather than the older `clone` system call. `clone3` was added to Linux
in the same release as pidfds, so this shouldn't unnecessarily limit the
kernel versions that this code supports.
Unresolved questions:
* What should the name of the feature gate be for these newly added
methods?
* Should the `pidfd` method distinguish between an error occurring
and `create_pidfd` not being called?
add `Stdin::lines`, `Stdin::split` forwarder methods
Add forwarder methods `Stdin::lines` and `Stdin::split`, which consume
and lock a `Stdin` handle, and forward on to the corresponding `BufRead`
methods. This should make it easier for beginners to use those iterator
constructors without explicitly dealing with locks or lifetimes.
Replaces #86412.
~~Based on #86846 to get the tracking issue number for the `stdio_locked` feature.~~ Rebased after merge, so it's only one commit now.
r? `@joshtriplett`
`@rustbot` label +A-io +C-enhancement +D-newcomer-roadblock +T-libs-api
implement TrustedLen for Flatten/FlatMap if the U: IntoIterator == [T; N]
This only works if arrays are passed directly instead of array iterators
because we need to be sure that they have not been advanced before
Flatten does its size calculation.
resolves#87094
Merge libterm into libtest
I think it's quite clear at this point that rust won't stablize the current libterm APIs to the outside world. And its only user is libtest. The compiler doesn't use this api at all. So I'm merging the crate into libtest as a module.
This also allows me to remove 15% of the libterm code, since these APIs are dead-code now.
Since this is an example, this could really do with some review from
someone familiar with unsafe stuff !
I made the example no longer `no_run` since it works for me.
Fixes#81847
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>
Following up on #87236, add comments to the unix command-line argument
support explaining that the code doesn't mutate the system-provided
argc/argv, and that this is why the code doesn't need a lock or special
memory ordering.
Simplify command-line argument initialization on unix
Simplify Rust's command-line argument initialization code on unix:
- The cleanup code isn't needed, because it was just zeroing out non-owning variables at runtime cleanup time. After 91c3eee173, Rust's command-line initialization code on unix no longer allocates `CString`s and a `Vec` at startup time.
- The `Mutex` isn't needed; if there's somehow a call to `args()` before argument initialization has happened, the code returns return an empty list, which we can do with a null check.
With these changes, a simple cdylib that doesn't use threads avoids getting `pthread_mutex_lock`/`pthread_mutex_unlock` in its symbol table.
Move asm! and global_asm! to core::arch
Follow-up to https://github.com/rust-lang/stdarch/pull/1183 .
Implements the libs-api team decision from rust-lang/rust#84019 (comment) .
In order to not break nightly users, this PR also adds the newly-moved items to the prelude. However, a decision will need to be made before stabilization as to whether these items should remain in the prelude. I will file an issue for this separately.
Fixes#84019 .
r? `@Amanieu`
Mark `Option::insert` as must_use
Some people seems misled by the function name and use it in case where a simple assignment just works.
If the return value is not used, `option = Some(value);` should be preferred instead of `option.insert(value);`
Add diagnostic items for Clippy
This adds a bunch of diagnostic items to `std`/`core`/`alloc` functions, structs and traits used in Clippy. The actual refactorings in Clippy to use these items will be done in a different PR in Clippy after the next sync.
This PR doesn't include all paths Clippy uses, I've only gone through the first 85 lines of Clippy's [`paths.rs`](ecf85f4bdc/clippy_utils/src/paths.rs) (after rust-lang/rust-clippy#7466) to get some feedback early on. I've also decided against adding diagnostic items to methods, as it would be nicer and more scalable to access them in a nicer fashion, like adding a `is_diagnostic_assoc_item(did, sym::Iterator, sym::map)` function or something similar (Suggested by `@camsteffen` [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/147480-t-compiler.2Fwg-diagnostics/topic/Diagnostic.20Item.20Naming.20Convention.3F/near/225024603))
There seems to be some different naming conventions when it comes to diagnostic items, some use UpperCamelCase (`BinaryHeap`) and some snake_case (`hashmap_type`). This PR uses UpperCamelCase for structs and traits and snake_case with the module name as a prefix for functions. Any feedback on is this welcome.
cc: rust-lang/rust-clippy#5393
r? `@Manishearth`
In the command-line argument initialization code, remove the Mutex
around the `ARGV` and `ARGC` variables, and simply check whether
ARGV is non-null before dereferencing it. This way, if either of
ARGV or ARGC is not initialized, we'll get an empty argument list.
This allows simple cdylibs to avoid having
`pthread_mutex_lock`/`pthread_mutex_unlock` appear in their symbol
tables if they don't otherwise use threads.
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.
# Summary
Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).
Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.
In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.
Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.
```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>(); // Err(ParseFloatError { kind: Invalid })
```
# Solution
This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.
**Documentation**
Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:
```rust
// Round-to-even only happens for negative values of q
// when q ≥ −4 in the 64-bit case and when q ≥ −17 in
// the 32-bitcase.
//
// When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
// have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
// 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
//
// When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
// so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
// or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
// (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
// or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
//
// Thus we have that we only need to round ties to even when
// we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
// (in the 32-bit case). In both cases,the power of five(5^|q|)
// fits in a 64-bit word.
const MIN_EXPONENT_ROUND_TO_EVEN: i32;
const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```
This ensures maintainability of the code base.
**Improvements for Disguised Fast-Path Cases**
The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.
However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.
**Digit Parsing Improvements**
Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.
**Unsafe Changes**
Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.
```rust
impl AsciiStr {
#[inline]
pub fn step_by(&mut self, n: usize) -> &mut Self {
unsafe { self.ptr = self.ptr.add(n) };
self
}
}
...
#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
// the first character is 'e'/'E' and scientific mode is enabled
let start = *s;
s.step();
...
}
```
The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.
```rust
impl AsciiStr {
/// Advance the view by n, advancing it in-place to (n..).
pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
// SAFETY: same as step_by, safe as long n is less than the buffer length
self.ptr = unsafe { self.ptr.add(n) };
self
}
}
...
/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
let start = *s;
// SAFETY: the first character is 'e'/'E' and scientific mode is enabled
unsafe {
s.step();
}
...
}
```
This allows us to trivially demonstrate the new implementation of dec2flt is safe.
**Inline Annotations Have Been Removed**
In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).
**Fixed Correctness Tests**
Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.
**Undefined Behavior**
An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:
```rust
#[inline]
pub fn check_len(&self, n: usize) -> bool {
unsafe { self.ptr.add(n) <= self.end }
}
```
And the new implementation is as follows:
```rust
/// Check if the slice at least `n` length.
fn check_len(&self, n: usize) -> bool {
n <= self.as_ref().len()
}
```
Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).
**Inferring Binary Exponents**
Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).
# Code Size
The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.
**new**
Using rustc version 1.55.0-dev.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K
**old**
Using rustc version 1.53.0-nightly.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K
# Correctness
The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.
# Issues Addressed
This will fix and close the following issues:
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Implementation is based off fast-float-rust, with a few notable changes.
- Some unsafe methods have been removed.
- Safe methods with inherently unsafe functionality have been removed.
- All unsafe functionality is documented and provably safe.
- Extensive documentation has been added for simpler maintenance.
- Inline annotations on internal routines has been removed.
- Fixed Python errors in src/etc/test-float-parse/runtests.py.
- Updated test-float-parse to be a library, to avoid missing rand dependency.
- Added regression tests for #31109 and #31407 in core tests.
- Added regression tests for #31109 and #31407 in ui tests.
- Use the existing slice primitive to simplify shared dec2flt methods
- Remove Miri ignores from dec2flt, due to faster parsing times.
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Due to #20400 the corresponding TrustedLen impls need a helper trait
instead of directly adding `Item = &[T;N]` bounds.
Since TrustedLen is a public trait this in turn means
the helper trait needs to be public. Since it's just a workaround
for a compiler deficit it's marked hidden, unstable and unsafe.
Correct invariant documentation for `steps_between`
Given that the previous example involves stepping forward from A to B, the equivalent example on this line would make most sense as stepping backward from B to A.
I should probably add a caveat here that I’m fairly new to Rust, and this is my first contribution to this repo, so it’s very possible that I’ve misunderstood how this is supposed to work (either on a technical level or a social one). If this is the case, please do let me know.
This only works if arrays are passed directly instead of array iterators
because we need to be sure that they have not been advanced before
Flatten does its size calculation.
This is a follow-up change to the fix for #75598. It simplifies the implementation of wrapping_neg() for all integer types by just calling 0.wrapping_sub(self) and always inlines it. This leads to much less assembly code being emitted for opt-level≤1.
Make the specialized Fuse still deal with None
Fixes#85863 by removing the assumption that we'll never see a cleared iterator in the `I: FusedIterator` specialization. Now all `Fuse` methods check for the possibility that `self.iter` is `None`, and the specialization only avoids _setting_ that to `None` in `&mut self` methods.
Given that the previous example involves stepping forward from A to B,
the equivalent example on this line would make most sense as stepping
backward from B to A.
expand: Support helper attributes for built-in derive macros
This is needed for https://github.com/rust-lang/rust/pull/86735 (derive macro `Default` should have a helper attribute `default`).
With this PR we can specify helper attributes for built-in derives using syntax `#[rustc_builtin_macro(MacroName, attributes(attr1, attr2, ...))]` which mirrors equivalent syntax for proc macros `#[proc_macro_derive(MacroName, attributes(attr1, attr2, ...))]`.
Otherwise expansion infra was already ready for this.
The attribute parsing code is shared between proc macro derives and built-in macros (`fn parse_macro_name_and_helper_attrs`).
Rollup of 6 pull requests
Successful merges:
- #87085 (Search result colors)
- #87090 (Make BTreeSet::split_off name elements like other set methods do)
- #87098 (Unignore some pretty printing tests)
- #87099 (Upgrade `cc` crate to 1.0.69)
- #87101 (Suggest a path separator if a stray colon is found in a match arm)
- #87102 (Add GUI test for "go to first" feature)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
create method overview docs for core::option and core::result
The `Option` and `Result` types have large lists of methods. They each could use an overview page of methods grouped by category. These proposed overviews include "truth tables" for the underappreciated boolean operators/combinators of these types. The methods are already somewhat categorized in the source, but some logical groupings are broken up by the necessities of putting related methods in different `impl` blocks, for example.
This is based on #86209, but those are small changes and unlikely to conflict.
Add forwarder methods `Stdin::lines` and `Stdin::split`, which consume
and lock a `Stdin` handle, and forward on to the corresponding `BufRead`
methods. This should make it easier for beginners to use those iterator
constructors without explicitly dealing with locks or lifetimes.
stdio_locked: add tracking issue
Add the tracking issue number #86845 to the stability attributes for the implementation in #86799.
r? `@joshtriplett`
`@rustbot` label +A-io +C-cleanup +T-libs-api
Remove unstable `io::Cursor::remaining`
Adding `io::Cursor::remaining` in #86037 caused a conflict with the implementation of `bytes::Buf` for `io::Cursor`, leading to an error in nightly, see https://github.com/rust-lang/rust/issues/86369#issuecomment-867723485.
This fixes the error by temporarily removing the `remaining` function.
r? `@yaahc`
Split MaybeUninit::write into new feature gate and stabilize it
This splits off the `MaybeUninit::write` function from the `maybe_uninit_extra` feature gate into a new `maybe_uninit_write` feature gate and stabilizes it.
Earlier work to improve the documentation of the write function: #86220
Tracking issue: #63567
[docs] Clarify behaviour of f64 and f32::sqrt when argument is negative zero
From IEEE 754 section 6.3:
> Except that squareRoot(−0) shall be −0, every numeric squareRoot result shall have a positive sign.
This will not affect ABI since the other variant of the enum is bigger.
It may break some code, but that would be very strange: usually people
don't continue after the first `Done` (or `None` for a normal iterator).