Commit Graph

72 Commits

Author SHA1 Message Date
bors
8cef65fde3 Auto merge of #77801 - fusion-engineering-forks:pin-mutex, r=Mark-Simulacrum
Enforce no-move rule of ReentrantMutex using Pin and fix UB in stdio

A `sys_common::ReentrantMutex` may not be moved after initializing it with `.init()`. This was not enforced, but only stated as a requirement in the comments on the unsafe functions. This change enforces this no-moving rule using `Pin`, by changing `&self` to a `Pin` in the `init()` and `lock()` functions.

This uncovered a bug I introduced in #77154: stdio.rs (the only user of ReentrantMutex) called `init()` on its ReentrantMutexes while constructing them in the intializer of `SyncOnceCell::get_or_init`, which would move them afterwards. Interestingly, the ReentrantMutex unit tests already had the same bug, so this invalid usage has been tested on all (CI-tested) platforms for a long time. Apparently this doesn't break badly on any of the major platforms, but it does break the rules.\*

To be able to keep using SyncOnceCell, this adds a `SyncOnceCell::get_or_init_pin` function, which makes it possible to work with pinned values inside a (pinned) SyncOnceCell. Whether this function should be public or not and what its exact behaviour and interface should be if it would be public is something I'd like to leave for a separate issue or PR. In this PR, this function is internal-only and marked with `pub(crate)`.

\* Note: That bug is now included in 1.48, while this patch can only make it to ~~1.49~~ 1.50. We should consider the implications of 1.48 shipping with a wrong usage of `pthread_mutex_t` / `CRITICAL_SECTION` / .. which technically invokes UB according to their specification. The risk is very low, considering the objects are not 'used' (locked) before the move, and the ReentrantMutex unit tests have verified this works fine in practice.

Edit: This has been backported and included in 1.48. And soon 1.49 too.

---

In future changes, I want to push this usage of Pin further inside `sys` instead of only `sys_common`, and apply it to all 'unmovable' objects there (`Mutex`, `Condvar`, `RwLock`). Also, while `sys_common`'s mutexes and condvars are already taken care of by #77147 and #77648, its `RwLock` should still be made movable or get pinned.
2020-12-10 23:43:20 +00:00
bors
2c56ea38b0 Auto merge of #78768 - mzabaluev:optimize-buf-writer, r=cramertj
Use is_write_vectored to optimize the write_vectored implementation for BufWriter

In case when the underlying writer does not have an efficient implementation `write_vectored`, the present implementation of
`write_vectored` for `BufWriter` may still forward vectored writes directly to the writer depending on the total length of the data. This misses the advantage of buffering, as the actually written slice may be small.

Provide an alternative code path for the non-vectored case, where the slices passed to `BufWriter` are coalesced in the buffer before being flushed to the underlying writer with plain `write` calls. The buffer is only bypassed if an individual slice's length is at least as large as the buffer.

Remove a FIXME comment referring to #72919 as the issue has been closed with an explanation provided.
2020-12-09 01:54:08 +00:00
Mara Bos
67c18fdec5 Use Pin for the 'don't move' requirement of ReentrantMutex.
The code in io::stdio before this change misused the ReentrantMutexes,
by calling init() on them and moving them afterwards. Now that
ReentrantMutex requires Pin for init(), this mistake is no longer easy
to make.
2020-12-08 22:57:57 +01:00
Mara Bos
2bc5d44ca9 Fix outdated comment about not needing to flush stderr. 2020-12-08 22:57:49 +01:00
Ian Jackson
b777552167 IntoInnerError: Provide into_error
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
2020-12-04 18:43:02 +00:00
Ian Jackson
19c7619dcd IntoInnerError: Provide into_parts
In particular, IntoIneerError only currently provides .error() which
returns a reference, not an owned value.  This is not helpful and
means that a caller of BufWriter::into_inner cannot acquire an owned
io::Error which seems quite wrong.

Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
2020-12-04 18:43:02 +00:00
Ian Jackson
db5d697004 std: impl of Write for &mut [u8]: document the buffer full error
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
2020-12-04 18:38:44 +00:00
Mikhail Zabaluev
674dd623ee Reduce branching in write_vectored for BufWriter
Do what write does and optimize for the most likely case:
slices are much smaller than the buffer. If a slice does not fit
completely in the remaining capacity of the buffer, it is left out
rather than buffered partially. Special treatment is only left for
oversized slices that are written directly to the underlying writer.
2020-11-22 17:05:14 +02:00
Mikhail Zabaluev
00deeb35c8 Fix is_write_vectored in LineWriterShim
Now that BufWriter always claims to support vectored writes,
look through it at the wrapped writer to decide whether to
use vectored writes for LineWriter.
2020-11-22 17:05:14 +02:00
Mikhail Zabaluev
9fc44239ec Make is_write_vectored return true for BufWriter
BufWriter provides an efficient implementation of
write_vectored also when the underlying writer does not
support vectored writes.
2020-11-22 17:05:13 +02:00
Mikhail Zabaluev
53196a8bcf Optimize write_vectored for BufWriter
If the underlying writer does not support efficient vectored output,
do it differently: always try to coalesce the slices in the buffer
until one comes that does not fit entirely. Flush the buffer before
the first slice if needed.
2020-11-22 17:05:13 +02:00
William Chargin
bdaa76cfde Fix typo in std::io::Write docs
These referred to a “`Write`er”—extra *e*. Presumably a copy-paste
holdover from “`Read`er”.

Test Plan:
Running ``git grep '`\?[Ww]rite`\?er'`` no longer finds any results.

wchargin-branch: io-write-docs
2020-11-17 15:32:23 -08:00
Mara Bos
11ce918c75
Rollup merge of #78714 - m-ou-se:simplify-local-streams, r=KodrAus
Simplify output capturing

This is a sequence of incremental improvements to the unstable/internal `set_panic` and `set_print` mechanism used by the `test` crate:

1. Remove the `LocalOutput` trait and use `Arc<Mutex<dyn Write>>` instead of `Box<dyn LocalOutput>`. In practice, all implementations of `LocalOutput` were just `Arc<Mutex<..>>`. This simplifies some logic and removes all custom `Sink` implementations such as `library/test/src/helpers/sink.rs`. Also removes a layer of indirection, as the outermost `Box` is now gone. It also means that locking now happens per `write_fmt`, not per individual `write` within. (So `"{} {}\n"` now results in one `lock()`, not four or more.)

2. Since in all cases the `dyn Write`s were just `Vec<u8>`s, replace the type with `Arc<Mutex<Vec<u8>>>`. This simplifies things more, as error handling and flushing can be removed now. This also removes the hack needed in the default panic handler to make this work with `::realstd`, as (unlike `Write`) `Vec<u8>` is from `alloc`, not `std`.

3. Replace the `RefCell`s by regular `Cell`s. The `RefCell`s were mostly used as `mem::replace(&mut *cell.borrow_mut(), something)`, which is just `Cell::replace`. This removes an unecessary bookkeeping and makes the code a bit easier to read.

4. Merge `set_panic` and `set_print` into a single `set_output_capture`. Neither the test crate nor rustc (the only users of this feature) have a use for using these separately. Merging them simplifies things even more. This uses a new function name and feature name, to make it clearer this is internal and not supposed to be used by other crates.

Might be easier to review per commit.
2020-11-16 17:26:27 +01:00
bors
30e49a9ead Auto merge of #75272 - the8472:spec-copy, r=KodrAus
specialize io::copy to use copy_file_range, splice or sendfile

Fixes #74426.
Also covers #60689 but only as an optimization instead of an official API.

The specialization only covers std-owned structs so it should avoid the problems with #71091

Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar.

There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.

The test case executes the following:

```
[pid 103776] statx(3, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=17, ...}) = 0
[pid 103776] write(4, "wxyz", 4)        = 4
[pid 103776] write(4, "iklmn", 5)       = 5
[pid 103776] copy_file_range(3, NULL, 4, NULL, 5, 0) = 5

```

0-1 `stat` calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted
𝖬 `write` to drain the `BufReader`/`BufWriter` wrappers. only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer it may absorb the read buffer contents first so only result in a single write. Vectored writes would also be an option but that would require more invasive changes to `BufWriter`.
𝖭 `copy_file_range`/`splice`/`sendfile` until file size, EOF or the byte limit from `Take` is reached. This should generally be *much* more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.

## Benchmarks

```

OLD

test io::tests::bench_file_to_file_copy         ... bench:      21,002 ns/iter (+/- 750) = 6240 MB/s    [ext4]
test io::tests::bench_file_to_file_copy         ... bench:      35,704 ns/iter (+/- 1,108) = 3671 MB/s  [btrfs]
test io::tests::bench_file_to_socket_copy       ... bench:      57,002 ns/iter (+/- 4,205) = 2299 MB/s
test io::tests::bench_socket_pipe_socket_copy   ... bench:     142,640 ns/iter (+/- 77,851) = 918 MB/s

NEW

test io::tests::bench_file_to_file_copy         ... bench:      14,745 ns/iter (+/- 519) = 8889 MB/s    [ext4]
test io::tests::bench_file_to_file_copy         ... bench:       6,128 ns/iter (+/- 227) = 21389 MB/s   [btrfs]
test io::tests::bench_file_to_socket_copy       ... bench:      13,767 ns/iter (+/- 3,767) = 9520 MB/s
test io::tests::bench_socket_pipe_socket_copy   ... bench:      26,471 ns/iter (+/- 6,412) = 4951 MB/s
```
2020-11-14 12:01:55 +00:00
The8472
888b1031bc limit visibility of copy offload helpers to sys::unix module 2020-11-13 22:38:27 +01:00
The8472
18bfe2a66b move copy specialization tests to their own module 2020-11-13 22:38:27 +01:00
The8472
7f5d2722af move copy specialization into sys::unix module 2020-11-13 22:38:23 +01:00
The8472
ad9b07c7e5 add benchmarks 2020-11-13 19:46:37 +01:00
The8472
46e7fbe60b reduce syscalls by inferring FD types based on source struct instead of calling stat()
also adds handling for edge-cases involving large sparse files where sendfile could fail with EOVERFLOW
2020-11-13 19:46:35 +01:00
The8472
0624730d9e add forwarding specializations for &mut variants
`impl Write for &mut T where T: Write`, thus the same should
apply to the specialization traits
2020-11-13 19:45:38 +01:00
The8472
cd3bddc044 prioritize sendfile over splice since it results in fewer context switches when sending to pipes
splice returns to userspace when the pipe is full, sendfile
just blocks until it's done, this can achieve much higher throughput
2020-11-13 19:45:38 +01:00
The8472
67a6059aa5 move tests module into separate file 2020-11-13 19:45:38 +01:00
The8472
5eb88fa5c7 hide unused exports on other platforms 2020-11-13 19:45:38 +01:00
The8472
16236470c1 specialize io::copy to use copy_file_range, splice or sendfile
Currently it only applies to linux systems. It can be extended to make use
of similar syscalls on other unix systems.
2020-11-13 19:45:27 +01:00
Mara Bos
aff7bd66e8 Merge set_panic and set_print into set_output_capture.
There were no use cases for setting them separately.
Merging them simplifies some things.
2020-11-10 21:58:13 +01:00
Mara Bos
08b7cb79e0 Use Cell instead of RefCell for LOCAL_{STDOUT,STDERR}. 2020-11-10 21:58:13 +01:00
Mara Bos
f534b75f05 Use Vec<u8> for LOCAL_STD{OUT,ERR} instead of dyn Write.
It was only ever used with Vec<u8> anyway. This simplifies some things.

- It no longer needs to be flushed, because that's a no-op anyway for
  a Vec<u8>.

- Writing to a Vec<u8> never fails.

- No #[cfg(test)] code is needed anymore to use `realstd` instead of
  `std`, because Vec comes from alloc, not std (like Write).
2020-11-10 21:58:09 +01:00
Mara Bos
72e96604c0 Remove io::LocalOutput and use Arc<Mutex<dyn>> for local streams. 2020-11-10 21:57:05 +01:00
Mara Bos
77f333b304
Rollup merge of #78811 - a1phyr:const_io_structs, r=dtolnay
Make some std::io functions `const`

Tracking issue: #78812

Make the following functions `const`:
- `io::Cursor::new`
- `io::Cursor::get_ref`
- `io::Cursor::position`
- `io::empty`
- `io::repeat`
- `io::sink`

r? `````@dtolnay`````
2020-11-08 13:36:19 +01:00
Benoît du Garreau
001dd7e6a5 Add tracking issue 2020-11-06 18:04:52 +01:00
Benoît du Garreau
ae059b532f Make some std::io functions const
Includes:
- io::Cursor::new
- io::Cursor::get_ref
- io::Cursor::position
- io::empty
- io::repeat
- io::sink
2020-11-06 17:48:26 +01:00
Peter Jaszkowiak
8d48e3bbb2 document HACKs 2020-11-05 19:26:08 -07:00
Peter Jaszkowiak
fe6dfcd28a Intra-doc links for std::io::buffered 2020-11-05 19:09:42 -07:00
bors
56d288fa46 Auto merge of #78227 - SergioBenitez:test-stdout-threading, r=m-ou-se
Capture output from threads spawned in tests

This is revival of #75172.

Original text:
> Fixes #42474.
>
> r? `@​dtolnay` since you expressed interest in this, but feel free to redirect if you aren't the right person anymore.

---

Closes #75172.
2020-10-27 11:43:18 +00:00
Michele Lacchia
a4ba179bdd
fix(docs): typo in BufWriter documentation 2020-10-26 11:13:47 +01:00
Sergio Benitez
db15596c57 Only load LOCAL_STREAMS if they are being used 2020-10-22 18:15:48 -07:00
Tyler Mandry
d0d0e78208 Capture output from threads spawned in tests
Fixes #42474.
2020-10-22 18:15:44 -07:00
Dylan DPC
5acb7f198f
Rollup merge of #76084 - Lucretiel:split-buffered, r=dtolnay
Refactor io/buffered.rs into submodules

This pull request splits `BufWriter`, `BufReader`, `LineWriter`, and `LineWriterShim` (along with their associated tests) into separate submodules. It contains no functional changes. This change is being made in anticipation of adding another type of buffered writer which can be switched between line- and block-buffering mode.

Part of a series of pull requests resolving #60673.
2020-10-16 02:10:04 +02:00
Mara Bos
de597fca40 Optimize set_{panic,print}(None). 2020-09-27 16:04:25 +02:00
Mara Bos
ed3ead013f Relax memory ordering of LOCAL_STREAMS and document it. 2020-09-27 16:04:25 +02:00
Mara Bos
07fd17f701 Only use LOCAL_{STDOUT,STDERR} when set_{print/panic} is used.
The thread local LOCAL_STDOUT and LOCAL_STDERR are only used by the test
crate to capture output from tests when running them in the same process
in differen threads. However, every program will check these variables
on every print, even outside of testing.

This involves allocating a thread local key, and registering a thread
local destructor. This can be somewhat expensive.

This change keeps a global flag (LOCAL_STREAMS) which will be set to
true when either of these local streams is used. (So, effectively only
in test and benchmark runs.) When this flag is off, these thread locals
are not even looked at and therefore will not be initialized on the
first output on every thread, which also means no thread local
destructors will be registered.
2020-09-27 16:04:25 +02:00
bors
c9e5e6a53a Auto merge of #77154 - fusion-engineering-forks:lazy-stdio, r=dtolnay
Remove std::io::lazy::Lazy in favour of SyncOnceCell

The (internal) std::io::lazy::Lazy was used to lazily initialize the stdout and stdin buffers (and mutexes). It uses atexit() to register a destructor to flush the streams on exit, and mark the streams as 'closed'. Using the stream afterwards would result in a panic.

Stdout uses a LineWriter which contains a BufWriter that will flush the buffer on drop. This one is important to be executed during shutdown, to make sure no buffered output is lost. It also forbids access to stdout afterwards, since the buffer is already flushed and gone.

Stdin uses a BufReader, which does not implement Drop. It simply forgets any previously read data that was not read from the buffer yet. This means that in the case of stdin, the atexit() function's only effect is making stdin inaccessible to the program, such that later accesses result in a panic. This is uncessary, as it'd have been safe to access stdin during shutdown of the program.

---

This change removes the entire io::lazy module in favour of SyncOnceCell. SyncOnceCell's fast path is much faster (a single atomic operation) than locking a sys_common::Mutex on every access like Lazy did.

However, SyncOnceCell does not use atexit() to drop the contained object during shutdown.

As noted above, this is not a problem for stdin. It simply means stdin is now usable during shutdown.

The atexit() call for stdout is moved to the stdio module. Unlike the now-removed Lazy struct, SyncOnceCell does not have a 'gone and unusable' state that panics. Instead of adding this again, this simply replaces the buffer with one with zero capacity. This effectively flushes the old buffer *and* makes any writes afterwards pass through directly without touching a buffer, making print!() available during shutdown without panicking.

---

In addition, because the contents of the SyncOnceCell are no longer dropped, we can now use `&'static` instead of `Arc` in `Stdout` and `Stdin`. This also saves two levels of indirection in `stdin()` and `stdout()`, since Lazy effectively stored a `Box<Arc<T>>`, and SyncOnceCell stores the `T` directly.
2020-09-27 04:50:46 +00:00
Mara Bos
6f9c1323a7 Call ReentrantMutex::init() in stdout(). 2020-09-24 19:25:21 +02:00
Mara Bos
45700a9d58 Drop use of Arc from Stdin and Stdout. 2020-09-24 19:09:33 +02:00
Mara Bos
bab15f773a Remove std::io::lazy::Lazy in favour of SyncOnceCell
The (internal) std::io::lazy::Lazy was used to lazily initialize the
stdout and stdin buffers (and mutexes). It uses atexit() to register a
destructor to flush the streams on exit, and mark the streams as
'closed'. Using the stream afterwards would result in a panic.

Stdout uses a LineWriter which contains a BufWriter that will flush the
buffer on drop. This one is important to be executed during shutdown,
to make sure no buffered output is lost. It also forbids access to
stdout afterwards, since the buffer is already flushed and gone.

Stdin uses a BufReader, which does not implement Drop. It simply forgets
any previously read data that was not read from the buffer yet. This
means that in the case of stdin, the atexit() function's only effect is
making stdin inaccessible to the program, such that later accesses
result in a panic. This is uncessary, as it'd have been safe to access
stdin during shutdown of the program.

---

This change removes the entire io::lazy module in favour of
SyncOnceCell. SyncOnceCell's fast path is much faster (a single atomic
operation) than locking a sys_common::Mutex on every access like Lazy
did.

However, SyncOnceCell does not use atexit() to drop the contained object
during shutdown.

As noted above, this is not a problem for stdin. It simply means stdin
is now usable during shutdown.

The atexit() call for stdout is moved to the stdio module. Unlike the
now-removed Lazy struct, SyncOnceCell does not have a 'gone and
unusable' state that panics. Instead of adding this again, this simply
replaces the buffer with one with zero capacity. This effectively
flushes the old buffer *and* makes any writes afterwards pass through
directly without touching a buffer, making print!() available during
shutdown without panicking.
2020-09-24 18:18:48 +02:00
ecstatic-morse
65bdf79da3
Rollup merge of #76275 - FedericoPonzi:immutable-write-impl-73836, r=dtolnay
Implementation of Write for some immutable ref structs

Fixes  #73836
2020-09-21 20:40:44 -07:00
Federico Ponzi
88a29e630c
Updates stability attributes to the current nightly version 2020-09-21 08:52:59 +02:00
Federico Ponzi
ec7f9b927f
Deduplicates io::Write implementations 2020-09-11 11:39:31 +02:00
Nathan West
96229f0240 move buffered.rs to mod.rs 2020-09-10 23:48:22 -04:00
Nathan West
a020142805 Refactor io/buffered.rs into submodules 2020-09-10 23:39:55 -04:00