The current implementation uses a `hlt` instruction, which is the most
direct way to notify a connected debugger but is not the most flexible
way. This commit changes it to a call to the `abort` libc function,
making it possible for a system designer to override its behavior as
they see fit.
make windows compat_fn (crudely) work on Miri
With https://github.com/rust-lang/rust/pull/95469, Windows `compat_fn!` now has to be supported by Miri to even make stdout work. Unfortunately, it relies on some outside-of-Rust linker hacks (`#[link_section = ".CRT$XCU"]`) that are rather hard to make work in Miri. So I came up with this crude hack to make this stuff work in Miri regardless. It should come at no cost for regular executions, so I hope this is okay.
Cc https://github.com/rust-lang/rust/issues/95627 `@ChrisDenton`
Update libc to 0.2.121
With the updated libc, UNIX stack overflow handling in libstd can now
use the common `si_addr` accessor function, rather than attempting to
use a field from that name in `siginfo_t`. This simplifies the
collection of the fault address, particularly on platforms where that
data resides within a union in `siginfo_t`.
Don't cast thread name to an integer for prctl
`libc::prctl` and the `prctl` definitions in glibc, musl, and the kernel headers are C variadic functions. Therefore, all the arguments (except for the first) are untyped. It is only the Linux man page which says that `prctl` takes 4 `unsigned long` arguments. I have no idea why it says this.
In any case, the upshot is that we don't need to cast the pointer to an integer and confuse Miri.
But in light of this... what are we doing with those three `0`s? We're passing 3 `i32`s to `prctl`, which doesn't fill me with confidence. The man page says `unsigned long` and all the constants in the linux kernel are macros for expressions of the form `1UL << N`. I'm mostly commenting on this because looks a whole lot like some UB that was found in SQLite a few years ago: <https://youtu.be/LbzbHWdLAI0?t=1925> that was related to accidentally passing a 32-bit value from a literal `0` instead of a pointer-sized value. This happens to work on x86 due to the size of pointers and happens to work on x86_64 due to the calling convention. But also, there is no good reason for an implementation to be looking at those arguments. Some other calls to `prctl` require that other arguments be zeroed, but not `PR_SET_NAME`... so why are we even passing them?
I would prefer to end such questions by either passing 3 `libc::c_ulong`, or not passing those at all, but I'm not sure which is better.
Fix unsound `File` methods
This is a draft attempt to fix#81357. *EDIT*: this PR now tackles `read()`, `write()`, `read_at()`, `write_at()` and `read_buf`. Still needs more testing though.
cc `@jstarks,` can you confirm the the Windows team is ok with the Rust stdlib using `NtReadFile` and `NtWriteFile`?
~Also, I'm provisionally using `CancelIo` in a last ditch attempt to recover but I'm not sure that this is actually a good idea. Especially as getting into this state would be a programmer error so aborting the process is justified in any case.~ *EDIT*: removed, see comments.
With the updated libc, UNIX stack overflow handling in libstd can now
use the common `si_addr` accessor function, rather than attempting to
use a field from that name in `siginfo_t`. This simplifies the
collection of the fault address, particularly on platforms where that
data resides within a union in `siginfo_t`.
Windows: Synchronize asynchronous pipe reads and writes
On Windows, the pipes used for spawned processes are opened for asynchronous access but `read` and `write` are done using the standard methods that assume synchronous access. This means that the buffer (and variables on the stack) may be read/written to after the function returns.
This PR ensures reads/writes complete before returning. Note that this only applies to pipes we create and does not affect the standard file read/write methods.
Fixes#95411
libc::prctl and the prctl definitions in glibc, musl, and the kernel
headers are C variadic functions. Therefore, all the arguments (except
for the first) are untyped. It is only the Linux man page which says
that prctl takes 4 unsigned long arguments. I have no idea why it says
this.
In any case, the upshot is that we don't need to cast the pointer to an
integer and confuse Miri.
Move std::sys::{mutex, condvar, rwlock} to std::sys::locks.
This cleans up the the std::sys modules a bit by putting the locks in a single module called `locks` rather than spread over the three modules `mutex`, `condvar`, and `rwlock`. This makes it easier to organise lock implementations, which helps with https://github.com/rust-lang/rust/issues/93740.
Skip a test if symlink creation is not possible
If someone running tests on Windows does not have Developer Mode enabled then creating symlinks will fail which in turn would cause this test to fail. This can be a stumbling block for contributors.
remove_dir_all: use fallback implementation on Miri
Fixes https://github.com/rust-lang/miri/issues/1966
The new implementation requires `openat`, `unlinkat`, and `fdopendir`. These cannot easily be shimmed in Miri since libstd does not expose APIs corresponding to them. So for now it is probably easiest to just use the fallback code in Miri. Nobody should run Miri as root anyway...
Add a `process_group` method to UNIX `CommandExt`
- Tracking issue: #93857
- RFC: https://github.com/rust-lang/rfcs/pull/3228
Add a `process_group` method to `std::os::unix::process::CommandExt` that
allows setting the process group id (i.e. calling `setpgid`) in the child, thus
enabling users to set process groups while leveraging the `posix_spawn` fast
path.
Update stdlib for the l4re target
This PR contains the work by ``@humenda`` and myself to update standard library support for the x86_64-unknown-l4re-uclibc tier 3 target, split out from humenda/rust as requested in #85967. The changes have been rebased on current master and updated in follow up commits by myself. The publishing of the changes is authorized and preferred by the original author. To preserve attribution, when standard library changes were introduced as part of other changes to the compiler, I have kept the changes concerning the standard library and altered the commit messages as indicated. Any incompatibilities have been remedied in follow up commits, so that the PR as a whole should result in a clean update of the target.
Use verbatim paths for `process::Command` if necessary
In #89174, the standard library started using verbatim paths so longer paths are usable by default. However, `Command` was originally left out because of the way `CreateProcessW` was being called. This was changed as a side effect of #87704 so now `Command` paths can be converted to verbatim too (if necessary).
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
L4Re provides limited POSIX support which includes support for
standard I/O streams, and a limited implementation of the standard file
handling API. However, because as a capability based OS it strives to
only make a local view available to each application, there are
currently no standardized special files like /dev/null that could serve
to sanitize closed standard FDs.
For now, skip any attempts to sanitize standard streams until a more
complete POSIX runtime is available.
unix: reduce the size of DirEntry
On platforms where we call `readdir` instead of `readdir_r`, we store
the name as an allocated `CString` for variable length. There's no point
carrying around a full `dirent64` with its fixed-length `d_name` too.
unix: Avoid name conversions in `remove_dir_all_recursive`
Each recursive call was creating an `OsString` for a `&Path`, only for
it to be turned into a `CString` right away. Instead we can directly
pass `.name_cstr()`, saving two allocations each time.
On platforms where we call `readdir` instead of `readdir_r`, we store
the name as an allocated `CString` for variable length. There's no point
carrying around a full `dirent64` with its fixed-length `d_name` too.
Each recursive call was creating an `OsString` for a `&Path`, only for
it to be turned into a `CString` right away. Instead we can directly
pass `.name_cstr()`, saving two allocations each time.
Unix path::absolute: Fix leading "." component
Testing leading `.` and `..` components were missing from the unix tests.
This PR adds them and fixes the leading `.` case. It also fixes the test cases so that they do an exact comparison.
This problem reported by ``@axetroy``
UNIX `remove_dir_all()`: Try recursing first on the slow path
This only affects the _slow_ code path - if there is no `dirent.d_type` or if it is `DT_UNKNOWN`.
POSIX specifies that calling `unlink()` or `unlinkat(..., 0)` on a directory is allowed to succeed:
> The _path_ argument shall not name a directory unless the process has appropriate privileges and the implementation supports using _unlink()_ on directories.
This however can cause dangling inodes requiring an fsck e.g. on Illumos UFS, so we have to avoid that in the common case. We now just try to recurse into it first and unlink() if we can't open it as a directory.
The other two commits integrate the Macos x86-64 implementation reducing redundancy. Split into two commits for better reviewing.
Fixes#94335.
This only affects the `slow` code path, if there is no `dirent.d_type` or if
the type is `DT_UNKNOWN`.
POSIX specifies that calling `unlink()` or `unlinkat(..., 0)` on a directory can
succeed:
> "The _path_ argument shall not name a directory unless the process has
> appropriate privileges and the implementation supports using _unlink()_ on
> directories."
This however can cause orphaned directories requiring an fsck e.g. on Illumos
UFS, so we have to avoid that in the common case. We now just try to recurse
into it first and unlink() if we can't open it as a directory.
Use `HandleOrNull` and `HandleOrInvalid` in the Windows FFI bindings.
Use the new `HandleOrNull` and `HandleOrInvalid` types that were introduced
as part of [I/O safety] in a few functions in the Windows FFI bindings.
This factors out an `unsafe` block and two `unsafe` function calls in the
Windows implementation code.
And, it helps test `HandleOrNull` and `HandleOrInvalid`, and indeed, it
turned up a bug: `OwnedHandle` also needs to be `#[repr(transparent)]`,
as it's used inside of `HandleOrNull` and `HandleOrInvalid` which are also
`#[repr(transparent)]`.
r? ```@joshtriplett```
[I/O safety]: https://github.com/rust-lang/rust/issues/87074
Use the new `HandleOrNull` and `HandleOrInvalid` types that were introduced
as part of [I/O safety] in a few functions in the Windows FFI bindings.
This factors out an `unsafe` block and two `unsafe` function calls in the
Windows implementation code.
And, it helps test `HandleOrNull` and `HandleOrInvalid`, which indeed turned
up a bug: `OwnedHandle` also needs to be `#[repr(transparent)]`, as it's
used inside of `HandleOrNull` and `HandleOrInvalid` which are also
`#[repr(transparent)]`.
[I/O safety]: https://github.com/rust-lang/rust/issues/87074
Rename `BorrowedFd::borrow_raw_fd` to `BorrowedFd::borrow_raw`.
Also, rename `BorrowedHandle::borrow_raw_handle` and
`BorrowedSocket::borrow_raw_socket` to `BorrowedHandle::borrow_raw` and
`BorrowedSocket::borrow_raw`.
This is just a minor rename to reduce redundancy in the user code calling
these functions, and to eliminate an inessential difference between
`BorrowedFd` code and `BorrowedHandle`/`BorrowedSocket` code.
While here, add a simple test exercising `BorrowedFd::borrow_raw_fd`.
r? ``````@joshtriplett``````
this avoids parsing mountinfo which can be huge on some systems and
something might be emulating cgroup fs for sandboxing reasons which means
it wouldn't show up as mountpoint
additionally the new implementation operates on a single pathbuffer, reducing allocations
Manually tested via
```
// spawn a new cgroup scope for the current user
$ sudo systemd-run -p CPUQuota="300%" --uid=$(id -u) -tdS
// quota.rs
#![feature(available_parallelism)]
fn main() {
println!("{:?}", std:🧵:available_parallelism()); // prints Ok(3)
}
```
Caveats
* cgroup v1 is ignored
* funky mountpoints (containing spaces, newlines or control chars) for cgroupfs will not be handled correctly since that would require unescaping /proc/self/mountinfo
The escaping behavior of procfs seems to be undocumented. systemd and docker default to `/sys/fs/cgroup` so it should be fine for most systems.
* quota will be ignored when `sched_getaffinity` doesn't work
* assumes procfs is mounted under `/proc` and cgroupfs mounted and readable somewhere in the directory tree
The ability to interoperate with C code via FFI is not limited to crates
using std; this allows using these types without std.
The existing types in `std::os::raw` become type aliases for the ones in
`core::ffi`. This uses type aliases rather than re-exports, to allow the
std types to remain stable while the core types are unstable.
This also moves the currently unstable `NonZero_` variants and
`c_size_t`/`c_ssize_t`/`c_ptrdiff_t` types to `core::ffi`, while leaving
them unstable.
use BOOL for TCP_NODELAY setsockopt value on Windows
This issue was found by the Wine project and mitigated there [^1].
Windows' setsockopt expects a BOOL (a typedef for int) for TCP_NODELAY
[^2]. Windows itself is forgiving and will accept any positive optlen and
interpret the first byte of *optval as the value, so this bug does not
affect Windows itself, but does affect systems implementing Windows'
interface more strictly, such as Wine. Wine was previously passing this
through to the host's setsockopt, where, e.g., Linux requires that
optlen be correct for the chosen option, and TCP_NODELAY expects an int.
[^1]: d6ea38f32d
[^2]: https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-setsockopt
This issue was found by the Wine project and mitigated there [1].
Windows' documented interface for `setsockopt` expects a `BOOL` (a
`typedef` for `int`) for `TCP_NODELAY` [2]. Windows is forgiving and
will accept any positive length and interpret the first byte of
`*option_value` as the value, so this bug does not affect Windows
itself, but does affect systems implementing Windows' interface more
strictly, such as Wine. Wine was previously passing this through to the
host's `setsockopt`, where, e.g., Linux requires that `option_len` be
correct for the chosen option, and `TCP_NODELAY` expects an `int`.
[1]: d6ea38f32d
[2]: https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-setsockopt
removing architecture requirements for RustyHermit
RustHermit and HermitCore is able to run on aarch64 and x86_64. In the future these operating systems will also support RISC-V. Consequently, the dependency to a specific target should be removed.
The build process of `hermit-abi` fails if the architecture isn't supported.
kmc-solid: Use the filesystem thread-safety wrapper
Fixes the thread unsafety of the `std::fs` implementation used by the [`*-kmc-solid_*`](https://doc.rust-lang.org/nightly/rustc/platform-support/kmc-solid.html) Tier 3 targets.
Neither the SOLID filesystem API nor built-in filesystem drivers guarantee thread safety by default. Although this may suffice in general embedded-system use cases, and in fact the API can be used from multiple threads without any problems in many cases, this has been a source of unsoundness in `std::sys::solid::fs`.
This commit updates the implementation to leverage the filesystem thread-safety wrapper (which uses a pluggable synchronization mechanism) to enforce thread safety. This is done by prefixing all paths passed to the filesystem API with `\TS`. (Note that relative paths aren't supported in this platform.)
This removes all mutex/atomics based workarounds for non-monotonic clocks and makes the previously panicking methods saturating instead.
Effectively this moves the monotonization from `Instant` construction to the comparisons.
This has some observable effects, especially on platforms without monotonic clocks:
* Incorrectly ordered Instant comparisons no longer panic. This may hide some programming errors until someone actually looks at the resulting `Duration`
* `checked_duration_since` will now return `None` in more cases. Previously it only happened when one compared instants obtained in the wrong order or
manually created ones. Now it also does on backslides.
The upside is reduced complexity and lower overhead of `Instant::now`.
Rename `FilenameTooLong` to `InvalidFilename` and also use it for Windows' `ERROR_INVALID_NAME`
Address https://github.com/rust-lang/rust/issues/90940#issuecomment-970157931
`ERROR_INVALID_NAME` (i.e. "The filename, directory name, or volume label syntax is incorrect") happens if we pass an invalid filename, directory name, or label syntax, so mapping as `InvalidInput` is reasonable to me.
kmc-solid: Fix wait queue manipulation errors in the `Condvar` implementation
This PR fixes a number of bugs in the `Condvar` wait queue implementation used by the [`*-kmc-solid_*`](https://doc.rust-lang.org/nightly/rustc/platform-support/kmc-solid.html) Tier 3 targets. These bugs can occur when there are multiple threads waiting on the same `Condvar` and sometimes manifest as an `unwrap` failure.
Neither the SOLID filesystem API nor built-in filesystems guarantee
thread safety by default. Although this may suffice in general embedded-
system use cases, and in fact the API can be used from multiple threads
without any problems in many cases, this has been a source of
unsoundness in `std::sys::solid::fs`.
This commit updates the `std` code to leverage the filesystem thread-
safety wrapper to enforce thread safety. This is done by prefixing all
paths passed to the filesystem API with `\TS`. (Note that relative paths
aren't supported in this platform.)
Use `NtCreateFile` instead of `NtOpenFile` to open a file
Generally the internal `Nt*` functions should be avoided but when we do need to use one we should stick to the most commonly used for the job. To that end, this PR replaces `NtOpenFile` with `NtCreateFile`.
NOTE: The initial version of this comment hypothesised that this may help with some recent false positives from malware scanners. This hypothesis proved wrong. Sorry for the distraction.
Make io::Error use 64 bits on targets with 64 bit pointers.
I've wanted this for a long time, but didn't see a good way to do it without having extra allocation. When looking at it yesterday, it was more clear what to do for some reason.
This approach avoids any additional allocations, and reduces the size by half (8 bytes, down from 16). AFAICT it doesn't come additional runtime cost, and the compiler seems to do a better job with code using it.
Additionally, this `io::Error` has a niche (still), so `io::Result<()>` is *also* 64 bits (8 bytes, down from 16), and `io::Result<usize>` (used for lots of io trait functions) is 2x64 bits (16 bytes, down from 24 — this means on x86_64 it can use the nice rax/rdx 2-reg struct return). More generally, it shaves a whole 64 bit integer register off of the size of basically any `io::Result<()>`.
(For clarity: Improving `io::Result` (rather than io::Error) was most of the motivation for this)
On 32 bit (or other non-64bit) targets we still use something equivalent the old repr — I don't think think there's improving it, since one of the fields it stores is a `i32`, so we can't get below that, and it's already about as close as we can get to it.
---
### Isn't Pointer Tagging Dodgy?
The details of the layout, and why its implemented the way it is, are explained in the header comment of library/std/src/io/error/repr_bitpacked.rs. There's probably more details than there need to be, but I didn't trim it down that much, since there's a lot of stuff I did deliberately, that might have not seemed that way.
There's actually only one variant holding a pointer which gets tagged. This one is the (holder for the) user-provided error.
I believe the scheme used to tag it is not UB, and that it preserves pointer provenance (even though often pointer tagging does not) because the tagging operation is just `core::ptr::add`, and untagging is `core::ptr::sub`. The result of both operations lands inside the original allocation, so it would follow the safety contract of `core::ptr::{add,sub}`.
The other pointer this had to encode is not tagged — or rather, the tagged repr is equivalent to untagged (it's tagged with 0b00, and has >=4b alignment, so we can reuse the bottom bits). And the other variants we encode are just integers, which (which can be untagged using bitwise operations without worry — they're integers).
CC `@RalfJung` for the stuff in repr_bitpacked.rs, as my comments are informed by a lot of the UCG work, but it's possible I missed something or got it wrong (even if the implementation is okay, there are parts of the header comment that says things like "We can't do $x" which could be false).
---
### Why So Many Changes?
The repr change was mostly internal, but changed one widely used API: I had to switch how `io::Error::new_const` works.
This required switching `io::Error::new_const` to take the full message data (including the kind) as a `&'static`, rather than just the string. This would have been really tedious, but I made a macro that made it much simpler, but it was a wide change since `io::Error::new_const` is used everywhere.
This included changing files for a lot of targets I don't have easy access to (SGX? Haiku? Windows? Who has heard of these things), so I expect there to be spottiness in CI initially, unless luck is on my side.
Anyway this large only tangentially-related change is all in the first commit (although that commit also pulls the previous repr out into its own file), whereas the packing stuff is all in commit 2.
---
P.S. I haven't looked at all of this since writing it, and will do a pass over it again later, sorry for any obvious typos or w/e. I also definitely repeat myself in comments and such.
(It probably could use more tests too. I did some basic testing, and made it so we `debug_assert!` in cases the decode isn't what we encoded, but I don't know the degree which I can assume libstd's testing of IO would exercise this. That is: it wouldn't be surprising to me if libstds IO testing were minimal, especially around error cases, although I have no idea).
Also, rename `BorrowedHandle::borrow_raw_handle` and
`BorrowedSocket::borrow_raw_socket` to `BorrowedHandle::borrow_raw` and
`BorrowedSocket::borrow_raw`.
This is just a minor rename to reduce redundancy in the user code calling
these functions, and to eliminate an inessential difference between
`BorrowedFd` code and `BorrowedHandle`/`BorrowedSocket` code.
While here, add a simple test exercising `BorrowedFd::borrow_raw_fd`.
kmc-solid: Fix off-by-one error in `SystemTime::now`
Fixes a miscalculation of `SystemTime` on the [`*-kmc-solid_*`](https://doc.rust-lang.org/nightly/rustc/platform-support/kmc-solid.html) Tier 3 targets.
Unlike the identically-named libc counterpart `tm::tm_mon`, `SOLID_RTC_TIME::tm_mon` contains a 1-based month number.
kmc-solid: Increase the default stack size
This PR increases the default minimum stack size on the [`*-kmc-solid_*`](https://doc.rust-lang.org/nightly/rustc/platform-support/kmc-solid.html) Tier 3 targets to 64KiB (Arm) and 128KiB (AArch64).
This value was chosen as a middle ground between supporting a relatively complex program (e.g., an application using a full-fledged off-the-shelf web server framework) with no additional configuration and minimizing resource consumption for the embedded platform that doesn't support lazily-allocated pages nor over-commitment (i.e., wasted stack spaces are wasted physical memory). If the need arises, the users can always set the `RUST_MIN_STACK` environmental variable to override the default stack size or use the platform API directly.
Errors from pthread_sigmask(3) were handled using cvt(), which expects a
return value of -1 on error and uses errno.
However, pthread_sigmask(3) returns 0 on success and an error number
otherwise.
Fix it by replacing cvt() with cvt_nz().
kmc-solid: Inherit the calling task's base priority in `Thread::new`
This PR fixes the initial priority calculation of spawned threads on the [`*-kmc-solid_*`](https://doc.rust-lang.org/nightly/rustc/platform-support/kmc-solid.html) Tier 3 targets.
Fixes a spawned task (an RTOS object on top of which threads are implemented for this target; unrelated to async tasks) getting an unexpectedly higher priority if it's spawned by a task whose priority is temporarily boosted by a priority-protection mutex.
unix: Use metadata for `DirEntry::file_type` fallback
When `DirEntry::file_type` fails to match a known `d_type`, we should
fall back to `DirEntry::metadata` instead of a bare `lstat`, because
this is faster and more reliable on targets with `fstatat`.
Fixes a spawned task getting an unexpectedly higher priority if it's
spawned by a task whose priority is temporarily boosted by a priority-
protection mutex.
When `DirEntry::file_type` fails to match a known `d_type`, we should
fall back to `DirEntry::metadata` instead of a bare `lstat`, because
this is faster and more reliable on targets with `fstatat`.
fs: Don't copy d_name from struct dirent
The dirent returned from readdir() is only guaranteed to be valid for
d_reclen bytes on common platforms. Since we copy the name separately
anyway, we can copy everything except d_name into DirEntry::entry.
Fixes#93384.
The dirent returned from readdir() is only guaranteed to be valid for
d_reclen bytes on common platforms. Since we copy the name separately
anyway, we can copy everything except d_name into DirEntry::entry.
Fixes#93384.
kmc-solid: Implement `net::FileDesc::duplicate`
This PR implements `std::sys::solid::net::FileDesc::duplicate`, which was accidentally left out when this target was added by #86191.
Bump libc and fix remove_dir_all on Fuchsia after CVE fix
With the previous `is_dir` impl, we would attempt to unlink
a directory in the None branch, but Fuchsia supports returning
ENOTEMPTY from unlinkat() without the AT_REMOVEDIR flag because
we don't currently differentiate unlinking files and directories
by default.
On the Fuchsia side I've opened https://fxbug.dev/92273 to discuss
whether this is the correct behavior, but it doesn't seem like
addressing the error code is necessary to make our tests happy.
Depends on https://github.com/rust-lang/libc/pull/2654 since we
apparently haven't needed to reference DT_UNKNOWN before this.
With the previous `is_dir` impl, we would attempt to unlink
a directory in the None branch, but Fuchsia supports returning
ENOTEMPTY from unlinkat() without the AT_REMOVEDIR flag because
we don't currently differentiate unlinking files and directories
by default.
On the Fuchsia side I've opened https://fxbug.dev/92273 to discuss
whether this is the correct behavior, but it doesn't seem like
addressing the error code is necessary to make our tests happy.
Updates std's libc crate to include DT_UNKNOWN for Fuchsia.
With the addition of `sock_accept()` to snapshot1, simple networking via
a passed `TcpListener` is possible. This patch implements the basics to
make a simple server work.
Signed-off-by: Harald Hoyer <harald@profian.com>
Add a `try_clone()` function to `OwnedFd`.
As suggested in #88564. This adds a `try_clone()` to `OwnedFd` by
refactoring the code out of the existing `File`/`Socket` code.
r? ``@joshtriplett``
Fix STD compilation for the ESP-IDF target (regression from CVE-2022-21658)
Commit 54e22eb7db broke the compilation of STD for the ESP-IDF embedded "unix-like" Tier 3 target, because the fix for [CVE-2022-21658](https://blog.rust-lang.org/2022/01/20/Rust-1.58.1.html) uses [libc flags](https://github.com/esp-rs/esp-idf-svc/runs/4892221554?check_suite_focus=true) which are not supported on the ESP-IDF platform.
This PR simply redirects the ESP-IDF compilation to the "classic" implementation, similar to REDOX. This should be safe because:
* Neither of the two filesystems supported by ESP-IDF (spiffs and fatfs) support [symlinks](https://github.com/natevw/fatfs/blob/master/README.md) in the first place
* There is no notion of fs permissions at all, as the ESP-IDF is an embedded platform that does not have the notion of users, groups, etc.
* Similarly, ESP-IDF has just one "process" - the firmware itself - which contains the user code and the "OS" fused together and running with all permissions
Print a helpful message if unwinding aborts when it reaches a nounwind function
This is implemented by routing `TerminatorKind::Abort` back through the panic handler, but with a special flag in the `PanicInfo` which indicates that the panic handler should *not* attempt to unwind the stack and should instead abort immediately.
This is useful for the planned change in https://github.com/rust-lang/lang-team/issues/97 which would make `Drop` impls `nounwind` by default.
### Code
```rust
#![feature(c_unwind)]
fn panic() {
panic!()
}
extern "C" fn nounwind() {
panic();
}
fn main() {
nounwind();
}
```
### Before
```
$ ./test
thread 'main' panicked at 'explicit panic', test.rs:4:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Illegal instruction (core dumped)
```
### After
```
$ ./test
thread 'main' panicked at 'explicit panic', test.rs:4:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at 'panic in a function that cannot unwind', test.rs:7:1
stack backtrace:
0: 0x556f8f86ec9b - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hdccefe11a6ac4396
1: 0x556f8f88ac6c - core::fmt::write::he152b28c41466ebb
2: 0x556f8f85d6e2 - std::io::Write::write_fmt::h0c261480ab86f3d3
3: 0x556f8f8654fa - std::panicking::default_hook::{{closure}}::h5d7346f3ff7f6c1b
4: 0x556f8f86512b - std::panicking::default_hook::hd85803a1376cac7f
5: 0x556f8f865a91 - std::panicking::rust_panic_with_hook::h4dc1c5a3036257ac
6: 0x556f8f86f079 - std::panicking::begin_panic_handler::{{closure}}::hdda1d83c7a9d34d2
7: 0x556f8f86edc4 - std::sys_common::backtrace::__rust_end_short_backtrace::h5b70ed0cce71e95f
8: 0x556f8f865592 - rust_begin_unwind
9: 0x556f8f85a764 - core::panicking::panic_no_unwind::h2606ab3d78c87899
10: 0x556f8f85b910 - test::nounwind::hade6c7ee65050347
11: 0x556f8f85b936 - test::main::hdc6e02cb36343525
12: 0x556f8f85b7e3 - core::ops::function::FnOnce::call_once::h4d02663acfc7597f
13: 0x556f8f85b739 - std::sys_common::backtrace::__rust_begin_short_backtrace::h071d40135adb0101
14: 0x556f8f85c149 - std::rt::lang_start::{{closure}}::h70dbfbf38b685e93
15: 0x556f8f85c791 - std::rt::lang_start_internal::h798f1c0268d525aa
16: 0x556f8f85c131 - std::rt::lang_start::h476a7ee0a0bb663f
17: 0x556f8f85b963 - main
18: 0x7f64c0822b25 - __libc_start_main
19: 0x556f8f85ae8e - _start
20: 0x0 - <unknown>
thread panicked while panicking. aborting.
Aborted (core dumped)
```
readdir() is preferred over readdir_r() on Linux and many other
platforms because it more gracefully supports long file names. Both
glibc and musl (and presumably all other Linux libc implementations)
guarantee that readdir() is thread-safe as long as a single DIR* is not
accessed concurrently, which is enough to make a readdir()-based
implementation of ReadDir safe. This implementation is already used for
some other OSes including Fuchsia, Redox, and Solaris.
See #40021 for more details. Fixes#86649. Fixes#34668.
Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().
This is an attempted fix of issue #91722, where a too-small buffer was passed to the read function of stdio on Windows. This caused an error to be returned when `read_to_end` or `read_to_string` were called. Both delegate to `std::io::default_read_to_end`, which creates a buffer that is of length >0, and forwards it to `std::io::Stdin::read()`. The latter method returns an error if the length of the buffer is less than 4, as there might not be enough space to allocate a UTF-16 character. This creates a problem when the buffer length is in `0 < N < 4`, causing the bug.
The current modification creates an internal buffer, much like the one used for the write functions
I'd also like to acknowledge the help of ``@agausmann`` and ``@hkratz`` in detecting and isolating the bug, and for suggestions that made the fix possible.
Couple disclaimers:
- Firstly, I didn't know where to put code to replicate the bug found in the issue. It would probably be wise to add that case to the testing suite, but I'm afraid that I don't know _where_ that test should be added.
- Secondly, the code is fairly fundamental to IO operations, so my fears are that this may cause some undesired side effects ~or performance loss in benchmarks.~ The testing suite runs on my computer, and it does fix the issue noted in #91722.
- Thirdly, I left the "surrogate" field in the Stdin struct, but from a cursory glance, it seems to be serving the same purpose for other functions. Perhaps merging the two would be appropriate.
Finally, this is my first pull request to the rust language, and as such some things may be weird/unidiomatic/plain out bad. If there are any obvious improvements I could do to the code, or any other suggestions, I would appreciate them.
Edit: Closes#91722
Quote bat script command line
Fixes#91991
[`CreateProcessW`](https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw#parameters) should only be used to run exe files but it does have some (undocumented) special handling for files with `.bat` and `.cmd` extensions. Essentially those magic extensions will cause the parameters to be automatically rewritten. Example pseudo Rust code (note that `CreateProcess` starts with an optional application name followed by the application arguments):
```rust
// These arguments...
CreateProcess(None, `@"foo.bat` "hello world""`@,` ...);
// ...are rewritten as
CreateProcess(Some(r"C:\Windows\System32\cmd.exe"), `@""foo.bat` "hello world"""`@,` ...);
```
However, when setting the first parameter (the application name) as we now do, it will omit the extra level of quotes around the arguments:
```rust
// These arguments...
CreateProcess(Some("foo.bat"), `@"foo.bat` "hello world""`@,` ...);
// ...are rewritten as
CreateProcess(Some(r"C:\Windows\System32\cmd.exe"), `@"foo.bat` "hello world""`@,` ...);
```
This means the arguments won't be passed to the script as intended.
Note that running batch files this way is undocumented but people have relied on this so we probably shouldn't break it.
They are also removed from the prelude as per the decision in
https://github.com/rust-lang/rust/issues/87228.
stdarch and compiler-builtins are updated to work with the new, stable
asm! and global_asm! macros.
Implement most of RFC 2930, providing the ReadBuf abstraction
This replaces the `Initializer` abstraction for permitting reading into uninitialized buffers, closing #42788.
This leaves several APIs described in the RFC out of scope for the initial implementation:
* read_buf_vectored
* `ReadBufs`
Closes#42788, by removing the relevant APIs.
Update std::env::temp_dir to use GetTempPath2 on Windows when available.
As a security measure, Windows 11 introduces a new temporary directory API, GetTempPath2.
When the calling process is running as SYSTEM, a separate temporary directory
will be returned inaccessible to non-SYSTEM processes. For non-SYSTEM processes
the behavior will be the same as before.
This can help mitigate against attacks such as this one:
https://medium.com/csis-techblog/cve-2020-1088-yet-another-arbitrary-delete-eop-a00b97d8c3e2
Compatibility risk: Software which relies on temporary files to communicate between SYSTEM and non-SYSTEM
processes may be affected by this change. In many cases, such patterns may be vulnerable to the very
attacks the new API was introduced to harden against.
I'm unclear on the Rust project's tolerance for such change-of-behavior in the standard library. If anything,
this PR is meant to raise awareness of the issue and hopefully start the conversation.
How tested: Taking the example code from the documentation and running it through psexec (from SysInternals) on
Win10 and Win11.
On Win10:
C:\test>psexec -s C:\test\main.exe
<...>
Temporary directory: C:\WINDOWS\TEMP\
On Win11:
C:\test>psexec -s C:\test\main.exe
<...>
Temporary directory: C:\Windows\SystemTemp\
Refactor weak symbols in std::sys::unix
This makes a few changes to the weak symbol macros in `sys::unix`:
- `dlsym!` is added to keep the functionality for runtime `dlsym`
lookups, like for `__pthread_get_minstack@GLIBC_PRIVATE` that we don't
want to show up in ELF symbol tables.
- `weak!` now uses `#[linkage = "extern_weak"]` symbols, so its runtime
behavior is just a simple null check. This is also used by `syscall!`.
- On non-ELF targets (macos/ios) where that linkage is not known to
behave, `weak!` is just an alias to `dlsym!` for the old behavior.
- `raw_syscall!` is added to always call `libc::syscall` on linux and
android, for cases like `clone3` that have no known libc wrapper.
The new `weak!` linkage does mean that you'll get versioned symbols if
you build with a newer glibc, like `WEAK DEFAULT UND statx@GLIBC_2.28`.
This might seem problematic, but old non-weak symbols can tie the build
to new versions too, like `dlsym@GLIBC_2.34` from their recent library
unification. If you build with an old glibc like `dist-x86_64-linux`
does, you'll still get unversioned `WEAK DEFAULT UND statx`, which may
be resolved based on the runtime glibc.
I also found a few functions that don't need to be weak anymore:
- Android can directly use `ftruncate64`, `pread64`, and `pwrite64`, as
these were added in API 12, and our baseline is API 14.
- Linux can directly use `splice`, added way back in glibc 2.5 and
similarly old musl. Android only added it in API 21 though.
According to documentation, the listed errnos should only occur
if the `copy_file_range` call cannot be made at all, so the
assert be correct. However, since in practice file system
drivers (incl. FUSE etc.) can return any errno they want, we
should not panic here.
Fixes#91152
Windows: Resolve `process::Command` program without using the current directory
Currently `std::process::Command` searches many directories for the executable to run, including the current directory. This has lead to a [CVE for `ripgrep`](https://cve.circl.lu/cve/CVE-2021-3013) but presumably other command line utilities could be similarly vulnerable if they run commands. This was [discussed on the internals forum](https://internals.rust-lang.org/t/std-command-resolve-to-avoid-security-issues-on-windows/14800). Also discussed was [which directories should be searched](https://internals.rust-lang.org/t/windows-where-should-command-new-look-for-executables/15015).
EDIT: This PR originally removed all implicit paths. They've now been added back as laid out in the rest of this comment.
## Old Search Strategy
The old search strategy is [documented here][1]. Additionally Rust adds searching the child's paths (see also #37519). So the full list of paths that were searched was:
1. The directories that are listed in the child's `PATH` environment variable.
2. The directory from which the application loaded.
3. The current directory for the parent process.
4. The 32-bit Windows system directory.
5. The 16-bit Windows system directory.
6. The Windows directory.
7. The directories that are listed in the PATH environment variable.
## New Search Strategy
The new strategy removes the current directory from the searched paths.
1. The directories that are listed in the child's PATH environment variable.
2. The directory from which the application loaded.
3. The 32-bit Windows system directory.
4. The Windows directory.
5. The directories that are listed in the parent's PATH environment variable.
Note that it also removes the 16-bit system directory, mostly because there isn't a function to get it. I do not anticipate this being an issue in modern Windows.
## Impact
Removing the current directory should fix CVE's like the one linked above. However, it's possible some Windows users of affected Rust CLI applications have come to expect the old behaviour.
This change could also affect small Windows-only script-like programs that assumed the current directory would be used. The user would need to use `.\file.exe` instead of the bare application name.
This PR could break tests, especially those that test the exact output of error messages (e.g. Cargo) as this does change the error messages is some cases.
[1]: https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessa#parameters
std: Get the standard library compiling for wasm64
This commit goes through and updates various `#[cfg]` as appropriate to
get the wasm64-unknown-unknown target behaving similarly to the
wasm32-unknown-unknown target. Most of this is just updating various
conditions for `target_arch = "wasm32"` to also account for `target_arch
= "wasm64"` where appropriate. This commit also lists `wasm64` as an
allow-listed architecture to not have the `restricted_std` feature
enabled, enabling experimentation with `-Z build-std` externally.
The main goal of this commit is to enable playing around with
`wasm64-unknown-unknown` externally via `-Z build-std` in a way that's
similar to the `wasm32-unknown-unknown` target. These targets are
effectively the same and only differ in their pointer size, but wasm64
is much newer and has much less ecosystem/library support so it'll still
take time to get wasm64 fully-fledged.
This makes a few changes to the weak symbol macros in `sys::unix`:
- `dlsym!` is added to keep the functionality for runtime `dlsym`
lookups, like for `__pthread_get_minstack@GLIBC_PRIVATE` that we don't
want to show up in ELF symbol tables.
- `weak!` now uses `#[linkage = "extern_weak"]` symbols, so its runtime
behavior is just a simple null check. This is also used by `syscall!`.
- On non-ELF targets (macos/ios) where that linkage is not known to
behave, `weak!` is just an alias to `dlsym!` for the old behavior.
- `raw_syscall!` is added to always call `libc::syscall` on linux and
android, for cases like `clone3` that have no known libc wrapper.
The new `weak!` linkage does mean that you'll get versioned symbols if
you build with a newer glibc, like `WEAK DEFAULT UND statx@GLIBC_2.28`.
This might seem problematic, but old non-weak symbols can tie the build
to new versions too, like `dlsym@GLIBC_2.34` from their recent library
unification. If you build with an old glibc like `dist-x86_64-linux`
does, you'll still get unversioned `WEAK DEFAULT UND statx`, which may
be resolved based on the runtime glibc.
I also found a few functions that don't need to be weak anymore:
- Android can directly use `ftruncate64`, `pread64`, and `pwrite64`, as
these were added in API 12, and our baseline is API 14.
- Linux can directly use `splice`, added way back in glibc 2.5 and
similarly old musl. Android only added it in API 21 though.
Only use `clone3` when needed for pidfd
In #89522 we learned that `clone3` is interacting poorly with Gentoo's
`sandbox` tool. We only need that for the unstable pidfd extensions, so
otherwise avoid that and use a normal `fork`.
This is a re-application of beta #89924, now that we're aware that we need
more than just a temporary release fix. I also reverted 12fbabd27f, as
that was just fallout from using `clone3` instead of `fork`.
r? `@Mark-Simulacrum`
cc `@joshtriplett`
This commit goes through and updates various `#[cfg]` as appropriate to
get the wasm64-unknown-unknown target behaving similarly to the
wasm32-unknown-unknown target. Most of this is just updating various
conditions for `target_arch = "wasm32"` to also account for `target_arch
= "wasm64"` where appropriate. This commit also lists `wasm64` as an
allow-listed architecture to not have the `restricted_std` feature
enabled, enabling experimentation with `-Z build-std` externally.
The main goal of this commit is to enable playing around with
`wasm64-unknown-unknown` externally via `-Z build-std` in a way that's
similar to the `wasm32-unknown-unknown` target. These targets are
effectively the same and only differ in their pointer size, but wasm64
is much newer and has much less ecosystem/library support so it'll still
take time to get wasm64 fully-fledged.
Make `std:🧵:available_concurrency` support process-limited number of CPUs
Use `libc::sched_getaffinity` and count the number of CPUs in the returned mask. This handles cases where the process doesn't have access to all CPUs, such as when limited via `taskset` or similar.
This also covers cgroup cpusets.
In #89522 we learned that `clone3` is interacting poorly with Gentoo's
`sandbox` tool. We only need that for the unstable pidfd extensions, so
otherwise avoid that and use a normal `fork`.
`#[thread_local]` allows us to maintain a per-thread list of destructors. This also avoids the need to synchronize global data (which is particularly tricky within the TLS callback function).
Use libc::sched_getaffinity and count the number of CPUs in the returned
mask. This handles cases where the process doesn't have access to all
CPUs, such as when limited via taskset or similar.
Automatically convert paths to verbatim for filesystem operations that support it
This allows using longer paths without the user needing to `canonicalize` or manually prefix paths. If the path is already verbatim then this has no effect.
Fixes: #32689
As a security measure, Windows 11 introduces a new temporary directory API, GetTempPath2.
When the calling process is running as SYSTEM, a separate temporary directory
will be returned inaccessible to non-SYSTEM processes. For non-SYSTEM processes
the behavior will be the same as before.
removing TLS support in x86_64-unknown-none-hermitkernel
HermitCore's kernel itself doesn't support TLS. Consequently, the entries in x86_64-unknown-none-hermitkernel should be removed. This commit should help to finalize #89062.
linux/aarch64 Now() should be actually_monotonic()
While issues have been seen on arm64 platforms the Arm architecture requires
that the counter monotonically increases and that it must provide a uniform
view of system time (e.g. it must not be possible for a core to receive a
message from another core with a time stamp and observe time going backwards
(ARM DDI 0487G.b D11.1.2). While there have been a few 64bit SoCs that have
bugs (#49281, #56940) which cause time to not monotonically increase, these have
been fixed in the Linux kernel and we shouldn't penalize all Arm SoCs for those
who refuse to update their kernels:
SUN50I_ERRATUM_UNKNOWN1 - Allwinner A64 / Pine A64 - fixed in 5.1
FSL_ERRATUM_A008585 - Freescale LS2080A/LS1043A - fixed in 4.10
HISILICON_ERRATUM_161010101 - Hisilicon 1610 - fixed in 4.11
ARM64_ERRATUM_858921 - Cortex A73 - fixed in 4.12
255a3f3e18 std: Force `Instant::now()` to be monotonic added a Mutex to work around
this problem and a small test program using glommio shows the majority of time spent
acquiring and releasing this Mutex. 3914a7b0da tries to improve this, but actually
makes it worse on big systems as for 128b atomics a ldxp/stxp pair (and successful loop)
for v8.4 systems that don't support FEAT_LSE2 is required which is expensive as a lock
and because of how the load/store-exclusives scale on large Arm systems is both unfair
to threads and tends to go backwards in performance.
A small sample program using glommio improves by 70x on a 32 core Graviton2
system with this change.
[fuchsia] Update process info struct
The fuchsia platform is in the process of softly transitioning over to
using a new value for ZX_INFO_PROCESS with a new corresponding struct.
This change migrates libstd.
See [fxrev.dev/510478](https://fxrev.dev/510478) and [fxbug.dev/30751](https://fxbug.dev/30751) for more detail.
Use BCryptGenRandom instead of RtlGenRandom on Windows.
This removes usage of RtlGenRandom on Windows, in favour of BCryptGenRandom.
BCryptGenRandom isn't available on XP, but we dropped XP support a while ago.
The fuchsia platform is in the process of softly transitioning over to
using a new value for ZX_INFO_PROCESS with a new corresponding struct.
This change migrates libstd.
See fxrev.dev/510478 and fxbug.dev/30751 for more detail.
Fix ctrl-c causing reads of stdin to return empty on Windows.
Pressing ctrl+c (or ctrl+break) on Windows caused a blocking read of stdin to unblock and return empty, unlike other platforms which continue to block.
On ctrl-c, `ReadConsoleW` will return success, but also set `LastError` to `ERROR_OPERATION_ABORTED`.
This change detects this case, and re-tries the call to `ReadConsoleW`.
Fixes#89177. See issue for further details.
Tested on Windows 7 and Windows 10 with both MSVC and GNU toolchains
Add new tier-3 target: armv7-unknown-linux-uclibceabihf
This change adds a new tier-3 target: armv7-unknown-linux-uclibceabihf
This target is primarily used in embedded linux devices where system resources are slim and glibc is deemed too heavyweight. Cross compilation C toolchains are available [here](https://toolchains.bootlin.com/) or via [buildroot](https://buildroot.org).
The change is based largely on a previous PR #79380 with a few minor modifications. The author of that PR was unable to push the PR forward, and graciously allowed me to take it over.
Per the [target tier 3 policy](https://github.com/rust-lang/rfcs/blob/master/text/2803-target-tier-policy.md), I volunteer to be the "target maintainer".
This is my first PR to Rust itself, so I apologize if I've missed things!
Rename `std:🧵:available_conccurrency` to `std:🧵:available_parallelism`
_Tracking issue: https://github.com/rust-lang/rust/issues/74479_
This PR renames `std:🧵:available_conccurrency` to `std:🧵:available_parallelism`.
## Rationale
The API was initially named `std:🧵:hardware_concurrency`, mirroring the [C++ API of the same name](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency). We eventually decided to omit any reference to the word "hardware" after [this comment](https://github.com/rust-lang/rust/pull/74480#issuecomment-662045841). And so we ended up with `available_concurrency` instead.
---
For a talk I was preparing this week I was reading through ["Understanding and expressing scalable concurrency" (A. Turon, 2013)](http://aturon.github.io/academic/turon-thesis.pdf), and the following passage stood out to me (emphasis mine):
> __Concurrency is a system-structuring mechanism.__ An interactive system that deals with disparate asynchronous events is naturally structured by division into concurrent threads with disparate responsibilities. Doing so creates a better fit between problem and solution, and can also decrease the average latency of the system by preventing long-running computations from obstructing quicker ones.
> __Parallelism is a resource.__ A given machine provides a certain capacity for parallelism, i.e., a bound on the number of computations it can perform simultaneously. The goal is to maximize throughput by intelligently using this resource. For interactive systems, parallelism can decrease latency as well.
_Chapter 2.1: Concurrency is not Parallelism. Page 30._
---
_"Concurrency is a system-structuring mechanism. Parallelism is a resource."_ — It feels like this accurately captures the way we should be thinking about these APIs. What this API returns is not "the amount of concurrency available to the program" which is a property of the program, and thus even with just a single thread is effectively unbounded. But instead it returns "the amount of _parallelism_ available to the program", which is a resource hard-constrained by the machine's capacity (and can be further restricted by e.g. operating systems).
That's why I'd like to propose we rename this API from `available_concurrency` to `available_parallelism`. This still meets the criteria we previously established of not attempting to define what exactly we mean by "hardware", "threads", and other such words. Instead we only talk about "concurrency" as an abstract resource available to our program.
r? `@joshtriplett`
Manual Debug for Unix ExitCode ExitStatus ExitStatusError
These structs have misleading names. An ExitStatus[Error] is actually a Unix wait status; an ExitCode is actually an exit status. These misleading names appear in the `Debug` output.
The `Display` impls on Unix have been improved, but the `Debug` impls are still misleading, as reported in #74832.
Fix this by pretending that these internal structs are called `unix_exit_status` and `unix_wait_status` as applicable. (We can't actually rename the structs because of the way that the cross-platform machinery works: the names are cross-platform.)
After this change, this program
```
#![feature(exit_status_error)]
fn main(){
let x = std::process::Command::new("false").status().unwrap();
dbg!(x.exit_ok());
eprintln!("x={:?}",x);
}
```
produces this output
```
[src/main.rs:4] x.exit_ok() = Err(
ExitStatusError(
unix_wait_status(
256,
),
),
)
x=ExitStatus(unix_wait_status(256))
```
Closes#74832
Remove unnecessary unsafe block in `process_unix`
Because it's nested under this unsafe fn!
This block isn't detected as unnecessary because of a bug in the compiler: #88260.
This allows using longer paths for filesystem operations without the user needing to `canonicalize` or manually prefix paths.
If the path is already verbatim than this has no effect.
On MinGW toolchains the various features (such as function sections)
necessary to eliminate dead function references are disabled due to
various bugs. This means that the windows sockets library will most
likely remain linked to any mingw toolchain built program that also
utilizes libstd.
That said, I made an attempt to also enable `function-sections` and
`--gc-sections` during my experiments, but the symbol references
remained, sadly.
Restructure std::rt
These changes should reduce binary size slightly while at the same slightly improving performance of startup, thread spawning and `std:🧵:current()`. I haven't verified if the compiler is able to optimize some of these cases already, but at least for some others the compiler is unable to do these optimizations as they slightly change behavior in cases where program startup would crash anyway by omitting a backtrace and panic location.
I can remove 6f6bb16 if preferred.
The reference automatically coerces to a pointer. Writing an explicit
cast here is slightly misleading because that's most commonly used when
a pointer needs to be converted from one pointer type to another, e.g.
`*const c_void` to `*const sigaction` or vice versa.
SOLID[1] is an embedded development platform provided by Kyoto
Microcomputer Co., Ltd. This commit introduces a basic Tier 3 support
for SOLID.
# New Targets
The following targets are added:
- `aarch64-kmc-solid_asp3`
- `armv7a-kmc-solid_asp3-eabi`
- `armv7a-kmc-solid_asp3-eabihf`
SOLID's target software system can be divided into two parts: an
RTOS kernel, which is responsible for threading and synchronization,
and Core Services, which provides filesystems, networking, and other
things. The RTOS kernel is a μITRON4.0[2][3]-derived kernel based on
the open-source TOPPERS RTOS kernels[4]. For uniprocessor systems
(more precisely, systems where only one processor core is allocated for
SOLID), this will be the TOPPERS/ASP3 kernel. As μITRON is
traditionally only specified at the source-code level, the ABI is
unique to each implementation, which is why `asp3` is included in the
target names.
More targets could be added later, as we support other base kernels
(there are at least three at the point of writing) and are interested
in supporting other processor architectures in the future.
# C Compiler
Although SOLID provides its own supported C/C++ build toolchain, GNU Arm
Embedded Toolchain seems to work for the purpose of building Rust.
# Unresolved Questions
A μITRON4 kernel can support `Thread::unpark` natively, but it's not
used by this commit's implementation because the underlying kernel
feature is also used to implement `Condvar`, and it's unclear whether
`std` should guarantee that parking tokens are not clobbered by other
synchronization primitives.
# Unsupported or Unimplemented Features
Most features are implemented. The following features are not
implemented due to the lack of native support:
- `fs::File::{file_attr, truncate, duplicate, set_permissions}`
- `fs::{symlink, link, canonicalize}`
- Process creation
- Command-line arguments
Backtrace generation is not really a good fit for embedded targets, so
it's intentionally left unimplemented. Unwinding is functional, however.
## Dynamic Linking
Dynamic linking is not supported. The target platform supports dynamic
linking, but enabling this in Rust causes several problems.
- The linker invocation used to build the shared object of `std` is
too long for the platform-provided linker to handle.
- A linker script with specific requirements is required for the
compiled shared object to be actually loadable.
As such, we decided to disable dynamic linking for now. Regardless, the
users can try to create shared objects by manually invoking the linker.
## Executable
Building an executable is not supported as the notion of "executable
files" isn't well-defined for these targets.
[1] https://solid.kmckk.com/SOLID/
[2] http://ertl.jp/ITRON/SPEC/mitron4-e.html
[3] https://en.wikipedia.org/wiki/ITRON_project
[4] https://toppers.jp/
Fix WinUWP std compilation errors due to I/O safety
I/O safety for Windows has landed in #87329. However, it does not cover UWP specific parts and prevents all UWP targets from building. See https://github.com/YtFlow/Maple/issues/18. This PR fixes these compile errors when building std for UWP targets.
This is a straightforward wrapper that uses the existing helpers for C
string handling and errno handling.
Having this available is convenient for UNIX utility programs written in
Rust, and avoids having to call unsafe functions like `libc::chown`
directly and handle errors manually, in a program that may otherwise be
entirely safe code.
In addition, these functions provide a more Rustic interface by
accepting appropriate traits and using `None` rather than `-1`.
While issues have been seen on arm64 platforms the Arm architecture requires
that the counter monotonically increases and that it must provide a uniform
view of system time (e.g. it must not be possible for a core to receive a
message from another core with a time stamp and observe time going backwards
(ARM DDI 0487G.b D11.1.2). While there have been a few 64bit SoCs that have
bugs (#49281, #56940) which cause time to not monotonically increase, these have
been fixed in the Linux kernel and we shouldn't penalize all Arm SoCs for those
who refuse to update their kernels:
SUN50I_ERRATUM_UNKNOWN1 - Allwinner A64 / Pine A64 - fixed in 5.1
FSL_ERRATUM_A008585 - Freescale LS2080A/LS1043A - fixed in 4.10
HISILICON_ERRATUM_161010101 - Hisilicon 1610 - fixed in 4.11
ARM64_ERRATUM_858921 - Cortex A73 - fixed in 4.12
255a3f3e18 std: Force `Instant::now()` to be monotonic added a mutex to work around
this problem and a small test program using glommio shows the majority of time spent
acquiring and releasing this Mutex. 3914a7b0da tries to improve this, but actually
makes it worse on big systems as for 128b atomics a ldxp/stxp pair (and
successful loop) is required which is expensive as a lock and because of how
the load/store-exclusives scale on large Arm systems is both unfair to threads
and tends to go backwards in performance.
Allow writing of incomplete UTF-8 sequences to the Windows console via stdout/stderr
# Problem
Writes of just an incomplete UTF-8 byte sequence (e.g. `b"\xC3"` or `b"\xF0\x9F"`) to stdout/stderr with a Windows console attached error with `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` even though further writes could complete the codepoint. This is currently a rare occurence since the [linewritershim](2c56ea38b0/library/std/src/io/buffered/linewritershim.rs) implementation flushes complete lines immediately and buffers up to 1024 bytes for incomplete lines. It can still happen as described in #83258.
The problem will become more pronounced once the developer can switch stdout/stderr from line-buffered to block-buffered or immediate when the changes in the "Switchable buffering for Stdout" pull request (#78515) get merged.
# Patch description
If there is at least one valid UTF-8 codepoint all valid UTF-8 is passed through to the extracted `write_valid_utf8_to_console()` fn. The new code only comes into play if `write()` is being passed a short byte slice comprising an incomplete UTF-8 codepoint. In this case up to three bytes are buffered in the `IncompleteUtf8` struct associated with `Stdout` / `Stderr`. The bytes are accepted one at a time. As soon as an error can be detected `io::ErrorKind::InvalidData, "Windows stdio in console mode does not support writing non-UTF-8 byte sequences"` is returned. Once a complete UTF-8 codepoint is received it is passed to the `write_valid_utf8_to_console()` and the buffer length is set to zero.
Calling `flush()` will neither error nor write anything if an incomplete codepoint is present in the buffer.
# Tests
Currently there are no Windows-specific tests for console writing code at all. Writing (regression) tests for this problem is a bit challenging since unit tests and UI tests don't run in a console and suddenly popping up another console window might be surprising to developers running the testsuite and it might not work at all in CI builds. To just test the new functionality in unit tests the code would need to be refactored. Some guidance on how to proceed would be appreciated.
# Public API changes
* `std::str::verifications::utf8_char_width()` would be exposed as `std::str::utf8_char_width()` behind the "str_internals" feature gate.
# Related issues
* Fixes#83258.
* PR #78515 will exacerbate the problem.
# Open questions
* Add tests?
* Squash into one commit with better commit message?
Use the return value of readdir_r() instead of errno
POSIX says:
> If successful, the readdir_r() function shall return zero; otherwise,
> an error number shall be returned to indicate the error.
But we were previously using errno instead of the return value. This
led to issue #86649.
POSIX says:
> If successful, the readdir_r() function shall return zero; otherwise,
> an error number shall be returned to indicate the error.
But we were previously using errno instead of the return value. This
led to issue #86649.
These structs have misleading names. An ExitStatus[Error] is actually
a Unix wait status; an ExitCode is actually an exit status.
The Display impls are fixed, but the Debug impls are still misleading,
as reported in #74832.
Fix this by pretending that these internal structs are called
`unix_exit_status` and `unix_wait_status` as applicable. (We can't
actually rename the structs because of the way that the cross-platform
machinery works: the names are cross-platform.)
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Unbox mutexes, condvars and rwlocks on hermit
[RustyHermit](https://github.com/hermitcore/rusty-hermit) provides now movable synchronization primitives and we are able to unbox mutexes and condvars.
Change WASI's `RawFd` from `u32` to `c_int` (`i32`).
WASI previously used `u32` as its `RawFd` type, since its "file descriptors"
are unsigned table indices, and there's no fundamental reason why WASI can't
have more than 2^31 handles.
However, this creates myriad little incompability problems with code
that also supports Unix platforms, where `RawFd` is `c_int`. While WASI
isn't a Unix, it often shares code with Unix, and this difference made
such shared code inconvenient. #87329 is the most recent example of such
code.
So, switch WASI to use `c_int`, which is `i32`. This will mean that code
intending to support WASI should ideally avoid assuming that negative file
descriptors are invalid, even though POSIX itself says that file descriptors
are never negative.
This is a breaking change, but `RawFd` is considerd an experimental
feature in [the documentation].
[the documentation]: https://doc.rust-lang.org/stable/std/os/wasi/io/type.RawFd.html
r? `@alexcrichton`
WASI previously used `u32` as its `RawFd` type, since its "file descriptors"
are unsigned table indices, and there's no fundamental reason why WASI can't
have more than 2^31 handles.
However, this creates myriad little incompability problems with code
that also supports Unix platforms, where `RawFd` is `c_int`. While WASI
isn't a Unix, it often shares code with Unix, and this difference made
such shared code inconvenient. #87329 is the most recent example of such
code.
So, switch WASI to use `c_int`, which is `i32`. This will mean that code
intending to support WASI should ideally avoid assuming that negative file
descriptors are invalid, even though POSIX itself says that file descriptors
are never negative.
This is a breaking change, but `RawFd` is considerd an experimental
feature in [the documentation].
[the documentation]: https://doc.rust-lang.org/stable/std/os/wasi/io/type.RawFd.html
Move `os_str_bytes` to `sys::unix`
Followup to #84967, with `OsStrExt` and `OsStringExt` moved out of `sys_common`, there is no reason anymore for `os_str_bytes` to live in `sys_common` and not in sys. This pr moves it to the location `sys::unix::os_str` and reuses the code on other platforms via `#[path]` (as is common in `sys`) instead of importing.
Change environment variable getters to error recoverably
This PR changes the standard library environment variable getter functions to error recoverably (i.e. not panic) when given an invalid value.
On some platforms, it is invalid for environment variable names to contain `'\0'` or `'='`, or for their values to contain `'\0'`. Currently, the standard library panics when manipulating environment variables with names or values that violate these invariants. However, this behavior doesn't make a lot of sense, at least in the case of getters. If the environment variable is missing, the standard library just returns an error value, rather than panicking. It doesn't make sense to treat the case where the variable is invalid any differently from that. See the [internals thread](https://internals.rust-lang.org/t/why-should-std-var-panic/14847) for discussion. Thus, this PR changes the functions to error recoverably in this case as well.
If desired, I could change the functions that manipulate environment variables in other ways as well. I didn't do that here because it wasn't entirely clear what to change them to. Should they error silently or do something else? If someone tells me how to change them, I'm happy to implement the changes.
This fixes#86082, an ICE that arises from the current behavior. It also adds a regression test to make sure the ICE does not occur again in the future.
`@rustbot` label +T-libs
r? `@joshtriplett`
Bump bootstrap compiler to 1.55
Changing the cfgs for stdarch is missing, but my understanding is that we don't need to do it as part of this PR?
r? `@Mark-Simulacrum`
Add Linux-specific pidfd process extensions (take 2)
Continuation of #77168.
I addressed the following concerns from the original PR:
- make `CommandExt` and `ChildExt` sealed traits
- wrap file descriptors in `PidFd` struct representing ownership over the fd
- add `take_pidfd` to take the fd out of `Child`
- close fd when dropped
Tracking Issue: #82971
Update VxWork's UNIX support
1. VxWorks does not provide glibc
2. VxWorks does provide `sigemptyset` and `sigaddset`
Note: these changes are concurrent to [this PR](https://github.com/rust-lang/libc/pull/2295) in libc.
Background:
Over the last year, pidfd support was added to the Linux kernel. This
allows interacting with other processes. In particular, this allows
waiting on a child process with a timeout in a race-free way, bypassing
all of the awful signal-handler tricks that are usually required.
Pidfds can be obtained for a child process (as well as any other
process) via the `pidfd_open` syscall. Unfortunately, this requires
several conditions to hold in order to be race-free (i.e. the pid is not
reused).
Per `man pidfd_open`:
```
· the disposition of SIGCHLD has not been explicitly set to SIG_IGN
(see sigaction(2));
· the SA_NOCLDWAIT flag was not specified while establishing a han‐
dler for SIGCHLD or while setting the disposition of that signal to
SIG_DFL (see sigaction(2)); and
· the zombie process was not reaped elsewhere in the program (e.g.,
either by an asynchronously executed signal handler or by wait(2)
or similar in another thread).
If any of these conditions does not hold, then the child process
(along with a PID file descriptor that refers to it) should instead
be created using clone(2) with the CLONE_PIDFD flag.
```
Sadly, these conditions are impossible to guarantee once any libraries
are used. For example, C code runnng in a different thread could call
`wait()`, which is impossible to detect from Rust code trying to open a
pidfd.
While pid reuse issues should (hopefully) be rare in practice, we can do
better. By passing the `CLONE_PIDFD` flag to `clone()` or `clone3()`, we
can obtain a pidfd for the child process in a guaranteed race-free
manner.
This PR:
This PR adds Linux-specific process extension methods to allow obtaining
pidfds for processes spawned via the standard `Command` API. Other than
being made available to user code, the standard library does not make
use of these pidfds in any way. In particular, the implementation of
`Child::wait` is completely unchanged.
Two Linux-specific helper methods are added: `CommandExt::create_pidfd`
and `ChildExt::pidfd`. These methods are intended to serve as a building
block for libraries to build higher-level abstractions - in particular,
waiting on a process with a timeout.
I've included a basic test, which verifies that pidfds are created iff
the `create_pidfd` method is used. This test is somewhat special - it
should always succeed on systems with the `clone3` system call
available, and always fail on systems without `clone3` available. I'm
not sure how to best ensure this programatically.
This PR relies on the newer `clone3` system call to pass the `CLONE_FD`,
rather than the older `clone` system call. `clone3` was added to Linux
in the same release as pidfds, so this shouldn't unnecessarily limit the
kernel versions that this code supports.
Unresolved questions:
* What should the name of the feature gate be for these newly added
methods?
* Should the `pidfd` method distinguish between an error occurring
and `create_pidfd` not being called?
Following up on #87236, add comments to the unix command-line argument
support explaining that the code doesn't mutate the system-provided
argc/argv, and that this is why the code doesn't need a lock or special
memory ordering.
In the command-line argument initialization code, remove the Mutex
around the `ARGV` and `ARGC` variables, and simply check whether
ARGV is non-null before dereferencing it. This way, if either of
ARGV or ARGC is not initialized, we'll get an empty argument list.
This allows simple cdylibs to avoid having
`pthread_mutex_lock`/`pthread_mutex_unlock` appear in their symbol
tables if they don't otherwise use threads.
`weak!` is needed in a test in another module. With macros
1.0, importing `weak!` would require reordering module
declarations in `std/src/lib.rs`, which is a bit too
evil.
aborts: Clarify documentation and comments
In the docs for intrinsics::abort():
* Strengthen the recommendation by to use process::abort instead.
* Document the fact that it sometimes (ab)uses an LLVM debug trap and what the likely consequences are.
* State that the precise behaviour is unstable.
In the docs for process::abort():
* Promise that we have the same behaviour as C `abort()`.
* Document the likely consequences, including, specifically, the consequences on Unix.
In the internal comment for unix::abort_internal:
* Refer to the public docs for the public API functions.
* Correct and expand the description of libc::abort. Specifically:
* Do not claim that abort() unregisters signal handlers. It doesn't; it honours the SIGABRT handler.
* Discuss, extensively, the issue with abort() flushing stdio buffers.
* Describe the glibc behaviour in some detail.
Co-authored-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Fixes#40230
Add std::os::unix::fs::DirEntryExt2::file_name_ref(&self) -> &OsStr
Greetings!
This is my first PR here, so please forgive me if I've missed an important step or otherwise done something wrong. I'm very open to suggestions/fixes/corrections.
This PR adds a function that allows `std::fs::DirEntry` to vend a borrow of its filename on Unix platforms, which is especially useful for sorting. (Windows has (as I understand it) encoding differences that require an allocation.) This new function sits alongside the cross-platform [`file_name(&self) -> OsString`](https://doc.rust-lang.org/std/fs/struct.DirEntry.html#method.file_name) function.
I pitched this idea in an [internals thread](https://internals.rust-lang.org/t/allow-std-direntry-to-vend-borrows-of-its-filename/14328/4), and no one objected vehemently, so here we are.
I understand features in general, I believe, but I'm not at all confident that my whole-cloth invention of a new feature string (as required by the compiler) was correct (or that the name is appropriate). Further, there doesn't appear to be a test for the sibling `ino` function, so I didn't add one for this similarly trivial function either. If it's desirable that I should do so, I'd be happy to [figure out how to] do that.
The following is a trivial sample of a use-case for this function, in which directory entries are sorted without any additional allocations:
```rust
use std::os::unix::fs::DirEntryExt;
use std::{fs, io};
fn main() -> io::Result<()> {
let mut entries = fs::read_dir(".")?.collect::<Result<Vec<_>, io::Error>>()?;
entries.sort_unstable_by(|a, b| a.file_name_ref().cmp(b.file_name_ref()));
for p in entries {
println!("{:?}", p);
}
Ok(())
}
```
In the docs for intrinsics::abort():
* Strengthen the recommendation by to use process::abort instead.
* Document the fact that it (ab)uses an LLVM debug trap and what the
likely consequences are.
* State that the precise behaviour is unstable.
In the docs for process::abort():
* Promise that we have the same behaviour as C `abort()`.
* Document the likely consequences, including, specifically, the
consequences on Unix.
In the internal comment for unix::abort_internal:
* Refer to the public docs for the public API functions.
* Correct and expand the description of libc::abort. Specifically:
* Do not claim that abort() unregisters signal handlers. It doesn't;
it honours the SIGABRT handler.
* Discuss, extensively, the issue with abort() flushing stdio buffers.
* Describe the glibc behaviour in some detail.
Co-authored-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
When using `process::Command` on Windows, environment variable names must be case-preserving but case-insensitive
When using `Command` to set the environment variables, the key should be compared as uppercase Unicode but when set it should preserve the original case.
Fixes#85242
More ErrorKinds for common errnos
From the commit message of the main commit here (as revised):
```
There are a number of IO error situations which it would be very
useful for Rust code to be able to recognise without having to resort
to OS-specific code. Taking some Unix examples, `ENOTEMPTY` and
`EXDEV` have obvious recovery strategies. Recently I was surprised to
discover that `ENOSPC` came out as `ErrorKind::Other`.
Since I am familiar with Unix I reviwed the list of errno values in
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html
Here, I add those that most clearly seem to be needed.
`@CraftSpider` provided information about Windows, and references, which
I have tried to take into account.
This has to be insta-stable because we can't sensibly have a different
set of ErrorKinds depending on a std feature flag.
I have *not* added these to the mapping tables for any operating
systems other than Unix and Windows. I hope that it is OK to add them
now for Unix and Windows now, and maybe add them to other OS's mapping
tables as and when someone on that OS is able to consider the
situation.
I adopted the general principle that it was usually a bad idea to map
two distinct error values to the same Rust error code. I notice that
this principle is already violated in the case of `EACCES` and
`EPERM`, which both map to `PermissionDenied`. I think this was
probably a mistake but it would be quite hard to change now, so I
don't propose to do anything about that.
However, for Windows, there are sometimes different error codes for
identical situations. Eg there are WSA* versions of some error
codes as well as ERROR_* ones. Also Windows seems to have a great
many more erorr codes. I don't know precisely what best practice
would be for Windows.
```
<strike>
```
Errno values I wasn't sure about so *haven't* included:
EMFILE ENFILE ENOBUFS ENOLCK:
These are all fairly Unix-specific resource exhaustion situations.
In practice it seemed not very likely to me that anyone would want
to handle these differently to `Other`.
ENOMEM ERANGE EDOM EOVERFLOW
Normally these don't get exposed to the Rust callers I hope. They
don't tend to come out of filesystem APIs.
EILSEQ
Hopefully Rust libraries open files in binary mode and do the
converstion in Rust. So Rust code ought not to be exposed to
EILSEQ.
EIO
The range of things that could cause this is troublesome. I found
it difficult to describe. I do think it would be useful to add this
at some point, because EIO on a filesystem operation is much more
serious than most other errors.
ENETDOWN
I wasn't sure if this was useful or, indeed, if any modern systems
use it.
ENOEXEC
It is not clear to me how a Rust program could respond to this. It
seems rather niche.
EPROTO ENETRESET ENODATA ENOMSG ENOPROTOOPT ENOSR ENOSTR ETIME
ENOTRECOVERABLE EOWNERDEAD EBADMSG EPROTONOSUPPORT EPROTOTYPE EIDRM
These are network or STREAMS related errors which I have never in
my own Unix programming found the need to do anything with. I think
someone who understands these better should be the one to try to
find good Rust names and descriptions for them.
ENOTTY ENXIO ENODEV EOPNOTSUPP ESRCH EALREADY ECANCELED ECHILD
EINPROGRESS
These are very hard to get unless you're already doing something
very Unix-specific, in which case the raw_os_error interface is
probably more suitable than relying on the Rust ErrorKind mapping.
EFAULT EBADF
These would seem to be the result of application UB.
```
</strike>
<i>(omitted errnos are discussed below, especially in https://github.com/rust-lang/rust/pull/79965#issuecomment-810468334)
Fix double import in wasm thread
The `unsupported` type is imported two times, as `super::unsupported` and as `crate::sys::unsupported`, throwing an error. Remove `super::unsupported` in favor of the other.
As reported in #86802.
Fix#86802
The `unsupported` type is imported two times, as `super::unsupported` and as `crate::sys::unsupported`, throwing an error. Remove `super::unsupported` in favor of the other.
Redefine `ErrorKind::Other` and stop using it in std.
This implements the idea I shared yesterday in the libs meeting when we were discussing how to handle adding new `ErrorKind`s to the standard library: This redefines `Other` to be for *user defined errors only*, and changes all uses of `Other` in the standard library to a `#[doc(hidden)]` and permanently `#[unstable]` `ErrorKind` that users can not match on. This ensures that adding `ErrorKind`s at a later point in time is not a breaking change, since the user couldn't match on these errors anyway. This way, we use the `#[non_exhaustive]` property of the enum in a more effective way.
Open questions:
- How do we check this change doesn't cause too much breakage? Will a crate run help and be enough?
- How do we ensure we don't accidentally start using `Other` again in the standard library? We don't have a `pub(not crate)` or `#[deprecated(in this crate only)]`.
cc https://github.com/rust-lang/rust/pull/79965
cc `@rust-lang/libs` `@ijackson`
r? `@dtolnay`
Use HTTPS links where possible
While looking at #86583, I wondered how many other (insecure) HTTP links were in `rustc`. This changes most other `http` links to `https`. While most of the links are in comments or documentation, there are a few other HTTP links that are used by CI that are changed to HTTPS.
Notes:
- I didn't change any to or in licences
- Some links don't support HTTPS :(
- Some `http` links were dead, in those cases I upgraded them to their new places (all of which used HTTPS)
Move `available_concurrency` implementation to `sys`
This splits out the platform-specific implementation of `available_concurrency` to the corresponding platforms under `sys`. No changes are made to the implementation.
Tidy didn't lint against this code being originally added outside of `sys` because of a bug (see #84677), this PR also reverts the exclusion that was introduced in that bugfix.
Tracking issue of `available_concurrency`: #74479
Dump mingw-64's error codes into our source tree.
I have verified with these runes:
$ f=library/std/src/sys/windows/c/errors.rs
$ diff -ub <(git-cat-file blob HEAD~:$f | sort) <(cat $f | perl -pe 's/WSABASEERR \+ (\d+)/10000 + $1/e' |sort) |grep ^- |less
that this does not change any existing values.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
We're going to add many more of these.
This commit is pure code motion, plus the necessary administrivia, as
I have veried with the following runes:
$ git-diff HEAD~ | grep '^+' |sort >plus
$ git-diff HEAD~ | grep '^-' | perl -pe 's/^-/+/' |sort >min
$ diff -ub min plus |less
The output is precisely the expected `mod` and `use` directives.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
DWORD is a type alias for u32, so this makes no difference.
But this entry is anomalous and in my forthcoming commits I am going
to import many errors wholesale, and I spotted that my wholesale
import didn't match what was here.
CC: Chris Denton <christophersdenton@gmail.com>
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
use ErrorKind::*;
I don't feel confident enough about Windows things to reorder this
alphabetically
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Multiple improvements to RwLocks
This PR replicates #77147, #77380 and #84650 on RWLocks :
- Split `sys_common::RWLock` in `StaticRWLock` and `MovableRWLock`
- Unbox rwlocks on some platforms (Windows, Wasm and unsupported)
- Simplify `RwLock::into_inner`
Notes to reviewers :
- For each target, I copied `MovableMutex` to guess if `MovableRWLock` should be boxed.
- ~A comment says that `StaticMutex` is not re-entrant, I don't understand why and I don't know whether it applies to `StaticRWLock`.~
r? `@m-ou-se`
Since android ndk version `r23-beta3`, `libgcc` has been replaced with
`libunwind`. This moves the linking of `libgcc`/`libunwind` into the
`unwind` crate where we check if the system compiler can find
`libunwind` and fall back to `libgcc` if needed.
- Split `sys_common::RWLock` between `StaticRWLock` and `MovableRWLock`
- Unbox `RwLock` on some platforms (Windows, Wasm and unsupported)
- Simplify `RwLock::into_inner`
Fix `vxworks`
Some PRs made the `vxworks` target not build anymore. This PR fixes that:
- #82973: copy `ExitStatusError` implementation from `unix`.
- #84716: no `libc::chroot` available on `vxworks`, so for now don't implement `os::unix::fs::chroot`.
MSVC: Avoid using jmp stubs for dll function imports
Windows import libraries contain two symbols for every function: `__imp_FunctionName` and `FunctionName` (where `FunctionName` is the name of the function to be imported).
`__imp_FunctionName` contains the address of the imported function. This will be filled in by the Windows executable loader at runtime. `FunctionName` contains a jmp stub that simply jumps to the address given by `__imp_FunctionName`. E.g. it's a function that solely contains a single jmp instruction:
```asm
jmp __imp_FunctionName
```
When using an external DLL function in Rust, by default the linker will link to FunctionName, causing a bit of indirection at runtime. In Microsoft's C++ it's possible to instead tell it to insert calls to the address in `__imp_FunctionName` by using the `__declspec(dllimport)` attribute. In Rust it's possible to get effectively the same behaviour using the `#[link]` attribute on `extern` blocks.
----
The second commit also merges multiple `extern` blocks into one block. This is because otherwise Rust will currently create duplicate linker arguments for each block. In this case having duplicates shouldn't matter much other than the noise when displaying the linker command.
Windows implementation of feature `path_try_exists`
Draft of a Windows implementation of `try_exists` (#83186).
The first commit reorganizes the code so I would be interested to get some feedback on if this is a good idea or not. It moves the `Path::try_exists` function to `fs::exists`. leaving the former as a wrapper for the latter. This makes it easier to provide platform specific implementations and matches the `fs::metadata` function.
The other commit implements a Windows specific variant of `exists`. I'm still figuring out my approach so this is very much a first draft. Eventually this will need some more eyes from knowledgable Windows people.
Move `std::memchr` to `sys_common`
`std::memchr` is a thin abstraction over the different `memchr` implementations in `sys`, along with documentation and tests. The module is only used internally by `std`, nothing is exported externally. Code like this is exactly what the `sys_common` module is for, so this PR moves it there.
Provide ExitStatusError
Closes#73125
In MR #81452 "Add #[must_use] to [...] process::ExitStatus" we concluded that the existing arrangements in are too awkward so adding that `#[must_use]` is blocked on improving the ergonomics.
I wrote a mini-RFC-style discusion of the approach in https://github.com/rust-lang/rust/issues/73125#issuecomment-771092741
Do not allocate or unwind after fork
### Objective scenarios
* Make (simple) panics safe in `Command::pre_exec_hook`, including most `panic!` calls, `Option::unwrap`, and array bounds check failures.
* Make it possible to `libc::fork` and then safely panic in the child (needed for the above, but this requirement means exposing the new raw hook API which the `Command` implementation needs).
* In singlethreaded programs, where panic in `pre_exec_hook` is already memory-safe, prevent the double-unwinding malfunction #79740.
I think we want to make panic after fork safe even though the post-fork child environment is only experienced by users of `unsafe`, beause the subset of Rust in which any panic is UB is really far too hazardous and unnatural.
#### Approach
* Provide a way for a program to, at runtime, switch to having panics abort. This makes it possible to panic without making *any* heap allocations, which is needed because on some platforms malloc is UB in a child forked from a multithreaded program (see https://github.com/rust-lang/rust/pull/80263#issuecomment-774272370, and maybe also the SuS [spec](https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html)).
* Make that change in the child spawned by `Command`.
* Document the rules comprehensively enough that a programmer has a fighting chance of writing correct code.
* Test that this all works as expected (and in particular, that there aren't any heap allocations we missed)
Fixes#79740
#### Rejected (or previously attempted) approaches
* Change the panic machinery to be able to unwind without allocating, at least when the payload and message are both `'static`. This seems like it would be even more subtle. Also that is a potentially-hot path which I don't want to mess with.
* Change the existing panic hook mechanism to not convert the message to a `String` before calling the hook. This would be a surprising change for existing code and would not be detected by the type system.
* Provide a `raw_panic_hook` function to intercept panics in a way that doesn't allocate. (That was an earlier version of this MR.)
### History
This MR could be considered a v2 of #80263. Thanks to everyone who commented there. In particular, thanks to `@m-ou-se,` `@Mark-Simulacrum` and `@hyd-dev.` (Tagging you since I think you might be interested in this new MR.) Compared to #80263, this MR has very substantial changes and additions.
Additionally, I have recently (2021-04-20) completely revised this series following very helpful comments from `@m-ou-se.`
r? `@m-ou-se`
Some platforma (eg ARM64) apparently generate SIGTRAP for panic abort!
See eg
https://github.com/rust-lang/rust/pull/81858#issuecomment-840702765
This is probably a bug, but we don't want to entangle this MR with it.
When it's fixed, this commit should be reverted.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Closes#73125
This is in pursuance of
Issue #73127 Consider adding #[must_use] to std::process::ExitStatus
In
MR #81452 Add #[must_use] to [...] process::ExitStatus
we concluded that the existing arrangements in are too awkward
so adding that #[must_use] is blocked on improving the ergonomics.
I wrote a mini-RFC-style discusion of the approach in
https://github.com/rust-lang/rust/issues/73125#issuecomment-771092741
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
fork fails there. The failure message is confusing: so c.status()
returns an Err, the closure panics, and the test thinks the panic was
propagated from inside the child.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Co-authored-by: Mara Bos <m-ou.se@m-ou.se>
Rearrange SGX split module files
In #75979 several inlined modules were split out into multiple files.
This PR keeps the multiple files but moves a few things around to
organize things in a coherent way.
Cleanup of `wasm`
Some more cleanup of `sys`, this time `wasm`
- Reuse `unsupported::args` (functionally equivalent implementation, just an empty iterator).
- Split out `atomics` implementation of `wasm::thread`, the non-`atomics` implementation is reused from `unsupported`.
- Move all of the `atomics` code to a separate directory `wasm/atomics`.
````@rustbot```` label: +T-libs-impl
r? ````@m-ou-se````
In #75979 several inlined modules were split out into multiple files.
This PR keeps the multiple files but moves a few things around to
organize things in a coherent way.
This tests that we can indeed safely panic after fork, both
a raw libc::fork and in a Command pre_exec hook.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Co-authored-by: Mara Bos <m-ou.se@m-ou.se>
This is safe (does not involve heap allocation) but we don't yet have
a test to ensure that stays true. That will come in a moment.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Co-authored-by: Mara Bos <m-ou.se@m-ou.se>
Unwinding after fork() in the child is UB on some platforms, because
on those (including musl) malloc can be UB in the child of a
multithreaded program, and unwinding must box for the payload.
Even if it's safe, unwinding past fork() in the child causes whatever
traps the unwind to return twice. This is very strange and clearly
not desirable. With the default behaviour of the thread library, this
can even result in a panic in the child being transformed into zero
exit status (ie, success) as seen in the parent!
Fixes#79740.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Replace 'NULL' with 'null'
This replaces occurrences of "NULL" with "null" in docs, comments, and compiler error/lint messages. This is for the sake of consistency, as the lowercase "null" is already the dominant form in Rust. The all-caps NULL looks like the C macro (or SQL keyword), which seems out of place in a Rust context, given that NULL does not exist in the Rust language or standard library (instead having [`ptr::null()`](https://doc.rust-lang.org/stable/std/ptr/fn.null.html)).
Be stricter about rejecting LLVM reserved registers in asm!
LLVM will silently produce incorrect code if these registers are used as operands.
cc `@rust-lang/wg-inline-asm`
Add std::os::unix::fs::chroot to change the root directory of the current process
This is a straightforward wrapper that uses the existing helpers for C
string handling and errno handling.
Having this available is convenient for UNIX utility programs written in
Rust, and avoids having to call the unsafe `libc::chroot` directly and
handle errors manually, in a program that may otherwise be entirely safe
code.
Reuse `sys::unix::cmath` on other platforms
Reuse `sys::unix::cmath` on all non-`windows` platforms.
`unix` is chosen as the canonical location instead of `unsupported` or `common` because `unsupported` doesn't make sense semantically and `common` is reserved for code that is supported on all platforms. Also `unix` is already the home of some non-`windows` code that is technically not exclusive to `unix` like `unix::path`.
This is a straightforward wrapper that uses the existing helpers for C
string handling and errno handling.
Having this available is convenient for UNIX utility programs written in
Rust, and avoids having to call the unsafe `libc::chroot` directly and
handle errors manually, in a program that may otherwise be entirely safe
code.
Reuse modules on `hermit`
Reuse the following modules on `hermit`:
- `unix::path` (contents identical)
- `unsupported::io` (contents identical)
- `unsupported::thread_local_key` (contents functionally identical, only changes are the panic error messages)
`@rustbot` label: +T-libs-impl
Inline most raw socket, fd and handle conversions
Now that file descriptor types on Unix have niches, it is advantageous for user libraries which provide file descriptor wrappers (e.g. `Socket` from socket2) to store a `File` internally instead of a `RawFd`, so that the niche can be taken advantage of. However, doing so will currently result in worse performance as `IntoRawFd`, `FromRawFd` and `AsRawFd` are not inlined. This change adds `#[inline]` to those methods on std types that wrap file descriptors, handles or sockets.
Rework `init` and `cleanup`
This PR reworks the code in `std` that runs before and after `main` and centralizes this code respectively in the functions `init` and `cleanup` in both `sys_common` and `sys`. This makes is easy to see what code is executed during initialization and cleanup on each platform just by looking at e.g. `sys::windows::init`.
Full list of changes:
- new module `rt` in `sys_common` to contain `init` and `cleanup` and the runtime macros.
- `at_exit` and the mechanism to register exit handlers has been completely removed. In practice this was only used for closing sockets on windows and flushing stdout, which have been moved to `cleanup`.
- <s>On windows `alloc` and `net` initialization is now done in `init`, this saves a runtime check in every allocation and network use.</s>
Remove `sys::args::Args::inner_debug` and use `Debug` instead
This removes the method `sys::args::Args::inner_debug` on all platforms and implements `Debug` for `Args` instead.
I believe this creates a more natural API for the different platforms under `sys`: export a type `Args: Debug + Iterator + ...` vs. `Args: Iterator + ...` and with a method `inner_debug`.
Move `sys_common::rwlock::StaticRWLock` etc. to `sys::unix::rwlock`
This moves `sys_common::rwlock::StaticRwLock`, `RWLockReadGuard` and `RWLockWriteGuard` to `sys::unix::rwlock`. They are already `#[cfg(unix)]` and don't need to be in `sys_common`.
Replace `Void` in `sys` with never type
This PR replaces several occurrences in `sys` of the type `enum Void {}` with the Rust never type (`!`).
The name `Void` is unfortunate because in other languages (C etc.) it refers to a unit type, not an uninhabited type.
Note that the previous stabilization of the never type was reverted, however all uses here are implementation details and not publicly visible.
Fix join_paths error display.
On unix, the error from `join_paths` looked like this:
```
path segment contains separator `58`
```
This PR changes it to look like this:
```
path segment contains separator `:`
```
Fix stack overflow detection on FreeBSD 11.1+
Beginning with FreeBSD 10.4 and 11.1, there is one guard page by
default. And the stack autoresizes, so if Rust allocates its own guard
page, then FreeBSD's will simply move up one page. The best solution is
to just use the OS's guard page.
Rework `std::sys::windows::alloc`
I came across https://github.com/rust-lang/rust/pull/76676#discussion_r488729990, which points out that there was unsound code in the Windows alloc code, creating a &mut to possibly uninitialized memory. I reworked the code so that that particular issue does not occur anymore, and started adding more documentation and safety comments.
Full list of changes:
- moved and documented the relevant Windows Heap API functions
- refactor `allocate_with_flags` to `allocate` (and remove the other helper functions), which now takes just a `bool` if the memory should be zeroed
- add checks for if `GetProcessHeap` returned null
- add a test that checks if the size and alignment of a `Header` are indeed <= `MIN_ALIGN`
- add `#![deny(unsafe_op_in_unsafe_fn)]` and the necessary unsafe blocks with safety comments
I feel like I may have overdone the documenting, the unsoundness fix is the most important part; I could spit this PR up in separate parts.
Beginning with FreeBSD 10.4 and 11.1, there is one guard page by
default. And the stack autoresizes, so if Rust allocates its own guard
page, then FreeBSD's will simply move up one page. The best solution is
to just use the OS's guard page.
unix: Fix feature(unix_socket_ancillary_data) on macos and other BSDs
This adds support for CMSG handling on macOS and fixes it on OpenBSD and possibly other BSDs.
When traversing the CMSG list, the previous code had an exception for Android where the next element after the last pointer could point to the first pointer instead of NULL. This is actually not specific to Android: the `libc::CMSG_NXTHDR` implementation for Linux and emscripten have a special case to return NULL when the length of the previous element is zero; most other implementations simply return the previous element plus a zero offset in this case.
This MR makes the check non-optional which fixes CMSG handling and a possible endless loop on such systems; tested with file descriptor passing on OpenBSD, Linux, and macOS.
This MR additionally adds `SocketAncillary::is_empty` because clippy is right that it should be added.
This belongs to the `feature(unix_socket_ancillary_data)` tracking issue: https://github.com/rust-lang/rust/issues/76915
r? `@joshtriplett`
Improve fs error open_from unix
Consistency for #79399
Suggested by JohnTitor
r? `@JohnTitor`
Not user if the error is too long now, do we handle long errors well?
Add function core::iter::zip
This makes it a little easier to `zip` iterators:
```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```
You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:
```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```
It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:
```rust
iter::zip(
trait_ref.substs.types().skip(1),
impl_trait_ref.substs.types().skip(1),
)
// vs.
trait_ref
.substs
.types()
.skip(1)
.zip(impl_trait_ref.substs.types().skip(1))
```
This replaces the tuple-pair `IntoIterator` in #78204.
There is prior art for the utility of this in [`itertools::zip`].
[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
ExitStatus: print "exit status: {}" rather than "exit code: {}" on unix
Proper Unix terminology is "exit status" (vs "wait status"). "exit
code" is imprecise on Unix and therefore unclear. (As far as I can
tell, "exit code" is correct terminology on Windows.)
This new wording is unfortunately inconsistent with the identifier
names in the Rust stdlib.
It is the identifier names that are wrong, as discussed at length in eg
https://doc.rust-lang.org/nightly/std/process/struct.ExitStatus.htmlhttps://doc.rust-lang.org/nightly/std/os/unix/process/trait.ExitStatusExt.html
Unfortunately for API stability reasons it would be a lot of work, and
a lot of disruption, to change the names in the stdlib (eg to rename
`std::process::ExitStatus` to `std::process::ChildStatus` or
something), but we should fix the message output. Many (probably
most) readers of these messages about exit statuses will be users and
system administrators, not programmers, who won't even know that Rust
has this wrong terminology.
So I think the right thing is to fix the documentation (as I have
already done) and, now, the terminology in the implementation.
This is a user-visible change to the behaviour of all Rust programs
which run Unix subprocesses. Hopefully no-one is matching against the
exit status string, except perhaps in tests.
This adds support for CMSG handling on macOS and fixes it on OpenBSD
and other BSDs.
When traversing the CMSG list, the previous code had an exception for
Android where the next element after the last pointer could point to
the first pointer instead of NULL. This is actually not specific to
Android: the `libc::CMSG_NXTHDR` implementation for Linux and
emscripten have a special case to return NULL when the length of the
previous element is zero; most other implementations simply return the
previous element plus a zero offset in this case.
This MR additionally adds `SocketAncillary::is_empty` because clippy
is right that it should be added.
Proper Unix terminology is "exit status" (vs "wait status"). "exit
code" is imprecise on Unix and therefore unclear. (As far as I can
tell, "exit code" is correct terminology on Windows.)
This new wording is unfortunately inconsistent with the identifier
names in the Rust stdlib.
It is the identifier names that are wrong, as discussed at length in eg
https://doc.rust-lang.org/nightly/std/process/struct.ExitStatus.htmlhttps://doc.rust-lang.org/nightly/std/os/unix/process/trait.ExitStatusExt.html
Unfortunately for API stability reasons it would be a lot of work, and
a lot of disruption, to change the names in the stdlib (eg to rename
`std::process::ExitStatus` to `std::process::ChildStatus` or
something), but we should fix the message output. Many (probably
most) readers of these messages about exit statuses will be users and
system administrators, not programmers, who won't even know that Rust
has this wrong terminology.
So I think the right thing is to fix the documentation (as I have
already done) and, now, the terminology in the implementation.
This is a user-visible change to the behaviour of all Rust programs
which run Unix subprocesses. Hopefully no-one is matching against the
exit status string, except perhaps in tests.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Add internal io::Error::new_const to avoid allocations.
This makes it possible to have a io::Error containing a message with zero allocations, and uses that everywhere to avoid the *three* allocations involved in `io::Error::new(kind, "message")`.
The function signature isn't perfect, because it needs a reference to the `&str`. So for now, this is just a `pub(crate)` function. Later, we'll be able to use `fn new_const<MSG: &'static str>(kind: ErrorKind)` to make that a bit better. (Then we'll also be able to use some ZST trickery if that would result in more efficient code.)
See https://github.com/rust-lang/rust/issues/83352