Implement read_buf for TcpStream, Stdin, StdinLock, ChildStdout,
ChildStderr (and internally for AnonPipe, Handle, Socket), so
that it skips buffer initialization.
The other provided methods like read_to_string and read_to_end are
implemented in terms of read_buf and so benefit from the optimization
as well.
This commit also implements read_vectored and is_read_vectored where
applicable.
The UNIX and WASI implementations use `isatty`. The Windows
implementation uses the same logic the `atty` crate uses, including the
hack needed to detect msys terminals.
Implement this trait for `File` and for `Stdin`/`Stdout`/`Stderr` and
their locked counterparts on all platforms. On UNIX and WASI, implement
it for `BorrowedFd`/`OwnedFd`. On Windows, implement it for
`BorrowedHandle`/`OwnedHandle`.
Based on https://github.com/rust-lang/rust/pull/91121
Co-authored-by: Matt Wilkinson <mattwilki17@gmail.com>
Printing to stdio/stderr that have been opened with non-blocking
(O_NONBLOCK in linux) can result in an error, which is not handled
by std::io module causing a panic.
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Make `ReentrantMutex` movable and `const`
As `MovableMutex` is now `const`, it can be used to simplify the implementation and interface of the internal reentrant mutex type. Consequently, the standard error stream does not need to be wrapped in `OnceLock` and `OnceLock::get_or_init_pin()` can be removed.
Consistently present absent stdio handles on Windows as NULL handles.
This addresses #90964 by making the std API consistent about presenting
absent stdio handles on Windows as NULL handles. Stdio handles may be
absent due to `#![windows_subsystem = "windows"]`, due to the console
being detached, or due to a child process having been launched from a
parent where stdio handles are absent.
Specifically, this fixes the case of child processes of parents with absent
stdio, which previously ended up with `stdin().as_raw_handle()` returning
`INVALID_HANDLE_VALUE`, which was surprising, and which overlapped with an
unrelated valid handle value. With this patch, `stdin().as_raw_handle()`
now returns null in these situation, which is consistent with what it
does in the parent process.
And, document this in the "Windows Portability Considerations" sections of
the relevant documentation.
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
This addresses #90964 by making the std API consistent about presenting
absent stdio handles on Windows as NULL handles. Stdio handles may be
absent due to `#![windows_subsystem = "windows"]`, due to the console
being detached, or due to a child process having been launched from a
parent where stdio handles are absent.
Specifically, this fixes the case of child processes of parents with absent
stdio, which previously ended up with `stdin().as_raw_handle()` returning
`INVALID_HANDLE_VALUE`, which was surprising, and which overlapped with an
unrelated valid handle value. With this patch, `stdin().as_raw_handle()`
now returns null in these situation, which is consistent with what it
does in the parent process.
And, document this in the "Windows Portability Considerations" sections of
the relevant documentation.
The unifying theme for this commit is weak, admittedly. I put together a
list of "expensive" functions when I originally proposed this whole
effort, but nobody's cared about that criterion. Still, it's a decent
way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite
a bit.
Add forwarder methods `Stdin::lines` and `Stdin::split`, which consume
and lock a `Stdin` handle, and forward on to the corresponding `BufRead`
methods. This should make it easier for beginners to use those iterator
constructors without explicitly dealing with locks or lifetimes.
Add stderr_locked, stdin_locked, and stdout_locked free functions
to obtain owned locked stdio handles in a single step. Also add
into_lock methods to consume a stdio handle and return an owned
lock. These methods will make it easier to use locked stdio
handles without having to deal with lifetime problems or keeping
bindings to the unlocked handles around.
The code in io::stdio before this change misused the ReentrantMutexes,
by calling init() on them and moving them afterwards. Now that
ReentrantMutex requires Pin for init(), this mistake is no longer easy
to make.
Simplify output capturing
This is a sequence of incremental improvements to the unstable/internal `set_panic` and `set_print` mechanism used by the `test` crate:
1. Remove the `LocalOutput` trait and use `Arc<Mutex<dyn Write>>` instead of `Box<dyn LocalOutput>`. In practice, all implementations of `LocalOutput` were just `Arc<Mutex<..>>`. This simplifies some logic and removes all custom `Sink` implementations such as `library/test/src/helpers/sink.rs`. Also removes a layer of indirection, as the outermost `Box` is now gone. It also means that locking now happens per `write_fmt`, not per individual `write` within. (So `"{} {}\n"` now results in one `lock()`, not four or more.)
2. Since in all cases the `dyn Write`s were just `Vec<u8>`s, replace the type with `Arc<Mutex<Vec<u8>>>`. This simplifies things more, as error handling and flushing can be removed now. This also removes the hack needed in the default panic handler to make this work with `::realstd`, as (unlike `Write`) `Vec<u8>` is from `alloc`, not `std`.
3. Replace the `RefCell`s by regular `Cell`s. The `RefCell`s were mostly used as `mem::replace(&mut *cell.borrow_mut(), something)`, which is just `Cell::replace`. This removes an unecessary bookkeeping and makes the code a bit easier to read.
4. Merge `set_panic` and `set_print` into a single `set_output_capture`. Neither the test crate nor rustc (the only users of this feature) have a use for using these separately. Merging them simplifies things even more. This uses a new function name and feature name, to make it clearer this is internal and not supposed to be used by other crates.
Might be easier to review per commit.
specialize io::copy to use copy_file_range, splice or sendfile
Fixes#74426.
Also covers #60689 but only as an optimization instead of an official API.
The specialization only covers std-owned structs so it should avoid the problems with #71091
Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar.
There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.
The test case executes the following:
```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4) = 4
[pid 103776] write(4, "iklmn", 5) = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5
```
0-1 `stat` calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted
𝖬 `write` to drain the `BufReader`/`BufWriter` wrappers. only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer it may absorb the read buffer contents first so only result in a single write. Vectored writes would also be an option but that would require more invasive changes to `BufWriter`.
𝖭 `copy_file_range`/`splice`/`sendfile` until file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.
## Benchmarks
```
OLD
test io::tests::bench_file_to_file_copy ... bench: 21,002 ns/iter (+/- 750) = 6240 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 35,704 ns/iter (+/- 1,108) = 3671 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 142,640 ns/iter (+/- 77,851) = 918 MB/s
NEW
test io::tests::bench_file_to_file_copy ... bench: 14,745 ns/iter (+/- 519) = 8889 MB/s [ext4]
test io::tests::bench_file_to_file_copy ... bench: 6,128 ns/iter (+/- 227) = 21389 MB/s [btrfs]
test io::tests::bench_file_to_socket_copy ... bench: 13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy ... bench: 26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
It was only ever used with Vec<u8> anyway. This simplifies some things.
- It no longer needs to be flushed, because that's a no-op anyway for
a Vec<u8>.
- Writing to a Vec<u8> never fails.
- No #[cfg(test)] code is needed anymore to use `realstd` instead of
`std`, because Vec comes from alloc, not std (like Write).