Clean up warnings + `unsafe_op_in_unsafe_fn` when building std for armv6k-nintendo-3ds
See #127747
ping `@AzureMarker` `@Meziu`
I could only find one instance needing an extra `unsafe` that was not also shared with many other `unix` targets (presumably these will get covered in larger sweeping changes, I didn't want to introduce churn that would potentially conflict with those). The one codepath I found is shared with `vita` however, so also pinging `@nikarh` `@pheki` `@zetanumbers` just to make sure they're aware of this change.
Also removed one unused import from `process_unsupported` which should simply fix the warning for any target that uses it.
Deal with invalid UTF-8 from `gai_strerror`
When the system is using a non-UTF-8 locale, the value will indeed not be UTF-8. That sucks for everyone involved, but is no reason for panic. We can "handle" this gracefully by just using from lossy, replacing the invalid UTF-8 with � and keeping the accidentally valid UTF-8. Good luck when debugging, but at least it's not a crash.
We already do this for `strerror_r`.
fixes#127563
When the system is using a non-UTF-8 locale, the value will indeed not
be UTF-8. That sucks for everyone involved, but is no reason for panic.
We can "handle" this gracefully by just using from lossy, replacing the
invalid UTF-8 with the ? and keeping the accidentally valid UTF-8.
Good luck when debugging, but at least it's not a crash.
We already do this for `strerror_r`.
Windows: move BSD socket shims to netc
On Windows we need to alter a few types so that they can be used in the cross-platform socket code. Currently these alterations are spread throughout the `c` module with some more in the `netc` module.
Let's gather all our BSD compatibility shims in the `netc` module so it's all in one place and easier to discover.
kmc-solid: `#![forbid(unsafe_op_in_unsafe_fn)]`
The path logic _should_ handle the forbiddance in the itron sources correctly, despite them being an "out-of-line" module.
unix: document unsafety for std `sig{action,altstack}`
I found many surprising elements here while trying to wrap a measly 5 functions with `unsafe`. I would rather not "just" mindlessly wrap this code with `unsafe { }`, so I decided to document it properly.
On Unix, this code covers the "create and setup signal handler" part of the stack overflow code, and serves as the primary safety boundary for the signal handler. It is rarely audited, very gnarly, and worth extra attention. It calls other unsafe functions defined in this module, but "can we correctly map the right memory, or find the right address ranges?" are separate questions, and get increasingly platform-specific. The question here is the more general "are we doing everything in the correct order, and setting up the handler in the correct way?"
As part of this audit, I noticed that we do some peculiar things that we should probably refrain from. However, I avoided making changes that I deemed might have a different final result in Rust programs. I did, however, reorder some events so that the signal handler is installed _after_ we install the alternate stack. We do not run much code between these events, but it is probably best if the timespan between the handler being available and the new stack being installed is 0 nanoseconds.
Safely enforce thread name requirements
The requirements for the thread name to be both UTF-8 and null terminated are easily enforced by a wrapper type so lets do that. The fact this used to be just a bare `CString` has tripped me up before because it was entirely safe to use a non UTF-8 `CString`.
Add Process support for UEFI
UEFI does not have an actual process. However, it does provide methods to launch and execute another UEFI image. Having process support is important since it is possible to run rust test suit using `Command::output` and is the first step towards being able to run it for UEFI.
Here is an overview of how the support is implemented.
- We create a copy of the SystemTable. This is required since at least OVMF seems to crash if the original system table is modified.
- Stdout and Stderr pipe works by registering a new `simple_text_output` Protocol and pointing the child system table to use those.
- `Stdio::Inherit` just points the console to the current running image console which seems to work with even 3 levels of process.
- `spawn` is left unimplemented since it does not make sense for UEFI architecture. Additionally, since https://github.com/rust-lang/rust/pull/105458 was merged, the `spawn` and `output` implementations are completely independent.
Rollup of 6 pull requests
Successful merges:
- #127295 (CFI: Support provided methods on traits)
- #127814 (`C-cmse-nonsecure-call`: improved error messages)
- #127949 (fix: explain E0120 better cover cases when its raised)
- #127966 (Use structured suggestions for unconstrained generic parameters on impl blocks)
- #127976 (Lazy type aliases: Diagostics: Detect bivariant ty params that are only used recursively)
- #127978 (Avoid ref when using format! for perf)
r? `@ghost`
`@rustbot` modify labels: rollup
Avoid ref when using format! for perf
Clean up a few minor refs in `format!` macro, as it has a performance cost. Apparently the compiler is unable to inline `format!("{}", &variable)`, and does a run-time double-reference instead (format macro already does one level referencing). Inlining format args prevents accidental `&` misuse.
- Update system table crc32
- Fix unsound use of Box
- Free exit data
- Code improvements
- Introduce OwnedTable
- Update r-efi to latest version
- Use extended_varargs_abi_support for
install_multiple_protocol_interfaces and
uninstall_multiple_protocol_interfaces
- Fix comments
- Stub out args implementation
Signed-off-by: Ayush Singh <ayushdevel1325@gmail.com>
`use` declarations will be reformatted in #125443. Very rarely, there is
a desire to force a group of `use` declarations together in a way that
auto-formatting will break up. E.g. when you want a single comment to
apply to a group. #126776 dealt with all of these in the codebase,
ensuring that no comments intended for multiple `use` declarations would
end up in the wrong place. But some people were unhappy with it.
This commit uses `#[rustfmt::skip]` to create these custom `use` groups
in an idiomatic way for a few of the cases changed in #126776. This
works because rustfmt treats any `use` item annotated with
`#[rustfmt::skip]` as a barrier and won't reorder other `use` items
around it.
This is technically "not necessary", as we will "just" segfault instead
if we e.g. arrive inside the handler fn with the null altstack. However,
it seems incorrect to go about this hoping that segfaulting is okay,
seeing as how our purpose here is to mitigate stack overflow problems.
Make sure NEED_ALTSTACK syncs with PAGE_SIZE when we do.
Co-authored-by: Jonas Böttiger <jonasboettiger@icloud.com>
Use ThreadId instead of TLS-address in `ReentrantLock`
Fixes#123458
`ReentrantLock` currently uses the address of a thread local variable as an ID that's unique across all currently running threads. This can lead to uninituitive behavior as in #123458 if TLS blocks get reused. This PR changes `ReentrantLock` to instead use the `ThreadId` provided by `std` as the unique ID. `ThreadId` guarantees uniqueness across the lifetime of the whole process, so we don't need to worry about reusing IDs of terminated threads. The main appeal of this PR is thus the possibility of changing the `ReentrantLock` API to guarantee that if a thread leaks a lock guard, no other thread may ever acquire that lock again.
This does entail some complications:
- previously, the only way to retrieve the current thread ID would've been using `thread::current().id()` which creates a temporary `Arc` and which isn't available in TLS destructors. As part of this PR, the thread ID instead gets cached in its own thread local, as suggested [here](https://github.com/rust-lang/rust/issues/123458#issuecomment-2038207704).
- `ThreadId` is always 64-bit whereas the current implementation uses a usize-sized ID. Since this ID needs to be updated atomically, we can't simply use a single atomic variable on 32 bit platforms. Instead, we fall back to using a (sound) seqlock on 32-bit platforms, which works because only one thread at a time can write to the ID. This seqlock is technically susceptible to the ABA problem, but the attack vector to create actual unsoundness has to be very specific:
- You would need to be able to lock+unlock the lock exactly 2^31 times (or a multiple thereof) while a thread trying to lock it sleeps
- The sleeping thread would have to suspend after reading one half of the thread id but before reading the other half
- The teared result from combining the halves of the thread ID would have to exactly line up with the sleeping thread's ID
The risk of this occurring seems slim enough to be acceptable to me, but correct me if I'm wrong. This also means that the size of the lock increases by 8 bytes on 32-bit platforms, but this also shouldn't be an issue.
Performance wise, I did some crude testing of the only case where this could lead to real slowdowns, which is the case of locking a `ReentrantLock` that's already locked by the current thread. On both aarch64 and x86-64, there is (expectedly) pretty much no performance hit. I didn't have any 32-bit platforms to test the seqlock performance on, so I did the next best thing and just forced the 64-bit platforms to use the seqlock implementation. There, the performance degraded by ~1-2ns/(lock+unlock) on x86-64 and ~6-8ns/(lock+unlock) on aarch64, which is measurable but seems acceptable to me seeing as 32-bit platforms should be a small minority anyways.
cc `@joboet` `@RalfJung` `@CAD97`
This changes `ReentrantLock` to use `ThreadId` for the thread ownership check instead of the address of a thread local. Unlike TLS blocks, `ThreadId` is guaranteed to be unique across the lifetime of the process, so if any thread ever terminates while holding a `ReentrantLockGuard`, no other thread may ever acquire that lock again.
On platforms with 64-bit atomics, this is a very simple change. On other platforms, the approach used is slightly more involved, as explained in the module comment.
This also adds a `CURRENT_ID` thread local in addition to the already existing `CURRENT`. This allows us to access the current `ThreadId` without the relatively heavy machinery used by `thread::current().id()`.
Document the column numbers for the dbg! macro
The line numbers were also made consistent, some examples used the line numbers as shown on the playground while others used the line numbers that you would expect when just seeing the documentation.
The second option was chosen to make everything consistent.
unix: break `stack_overflow::install_main_guard` into smaller fn
This was one big deeply-indented function for no reason. This made it hard to reason about the boundaries of its safety. Or just, y'know, read. Simplify it by splitting it into platform-specific functions, but which are still asked to keep compiling (a desirable property, since all of these OS use a similar API).
This is mostly a whitespace change, so I suggest reviewing it only after setting Files changed -> (the options gear) -> [x] Hide whitespace as that will make it easier to see how the code was actually broken up instead of raw line diffs.
Windows: Use futex implementation for `Once`
Keep the queue implementation for win7.
Inspired by PR #121956
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.
This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using
r? <reviewer name>
-->
The line numbers were also made consistent, some examples used the line numbers as shown on the playground while others used the line numbers that you would expect when just seeing the documentation.
The second option was chosen to make everything consistent.
Prevent double reference in generic futex
In the Windows futex implementation we were a little lax at allowing references to references (i.e. `&&`) which can lead to deadlocks due to reading the wrong memory address. This uses a trait to tighten the constraints and ensure this doesn't happen.
r? libs
Make more Windows functions `#![deny(unsafe_op_in_unsafe_fn)]`
As part of #127747, I've evaluated some more Windows functions and added `unsafe` blocks where necessary. Some are just trivial wrappers that "inherit" the full unsafety of their function, but for others I've added some safety comments. A few functions weren't actually unsafe at all. I think they were just using `unsafe fn` to avoid an `unsafe {}` block.
I'm not touching `c.rs` yet because that is partially being addressed by another PR and also I have plans to further reduce the number of wrapper functions we have in there.
r? libs
This function is purely informative, answering where a stack starts.
This is a safe operation, even if an answer requires unsafe code,
and even if the result is some unsafe code decides to trust the answer.
It also doesn't need to fetch the PAGE_SIZE when its caller just did so!
Let's complicate its signature and in doing so simplify its operation.
This allows sprinkling around #[forbid(unsafe_op_in_unsafe_fn)]
zkvm: add `#[forbid(unsafe_op_in_unsafe_fn)]` in `stdlib`
This also adds an additional `unsafe` block to address compiler errors.
This PR is intended to address https://github.com/rust-lang/rust/issues/127747 for the zkvm target.
Use futex.rs for Windows thread parking
If I'm not overlooking anything then the Windows 10+ thread parking implementation is practically the same as the futex.rs implementation. So we may as well use the same implementation for both. The old version is still kept around for Windows 7 support.
r? ````@joboet```` if you wouldn't mind double checking I've not missed something
std: Use `read_unaligned` for reads from DWARF
There's a lot of... *stuff* going on here. Meanwhile, `read_unaligned` has been available since 1.17.0, so let's just use that.
Clean up more comments near use declarations
#125443 will reformat all use declarations in the repository. There are a few edge cases involving comments on use declarations that require care. This PR fixes them up so #125443 can go ahead with a simple `x fmt --all`. A follow-up to #126717.
r? ``@cuviper``
Simplify environment variable examples
I’ve found myself visiting the documentation for `std::env::vars` every few months, and every time I do, it is because I want to quickly get a snippet to print out all environment variables :-)
So I think it could be nice to simplify the examples a little to make them self-contained. It is of course a style question if one should import a module a not, but I personally don’t import modules used just once in a code snippet.
There are some comments describing multiple subsequent `use` items. When
the big `use` reformatting happens some of these `use` items will be
reordered, possibly moving them away from the comment. With this
additional level of formatting it's not really feasible to have comments
of this type. This commit removes them in various ways:
- merging separate `use` items when appropriate;
- inserting blank lines between the comment and the first `use` item;
- outright deletion (for comments that are relatively low-value);
- adding a separate "top-level" comment.
We also entirely skip formatting for four library files that contain
nothing but `pub use` re-exports, where reordering would be painful.
Make os/windows and pal/windows default to `#![deny(unsafe_op_in_unsafe_fn)]`
This is to prevent regressions in modules that currently pass. I did also fix up a few trivial places where the module contained only one or two simple wrappers. In more complex cases we should try to ensure the `unsafe` blocks are appropriately scoped and have any appropriate safety comments.
This does not fix the windows bits of #127747 but it should help prevent regressions until that is done and also make it more obvious specifically which modules need attention.
std: `#![deny(unsafe_op_in_unsafe_fn)]` in platform-independent code
This applies the `unsafe_op_in_unsafe_fn` lint in all places in std that _do not have platform-specific cfg in their code_. For all such places, the lint remains allowed, because they need further work to address the relevant concerns. This list includes:
- `std::backtrace_rs` (internal-only)
- `std::sys` (internal-only)
- `std::os`
Notably this eliminates all "unwrapped" unsafe operations in `std::io` and `std::sync`, which will make them much more auditable in the future. Such has *also* been left for future work. While I made a few safety comments along the way on interfaces I have grown sufficiently familiar with, in most cases I had no context, nor particular confidence the unsafety was correct.
In the cases where I was able to determine the unsafety was correct without having prior context, it was obviously redundant. For example, an unsafe function calling another unsafe function that has the exact same contract, forwarding its caller's requirements just as it forwards its actual call.
Windows: Remove some unnecessary type aliases
Back in the olden days, C did not have fixed-width types so these type aliases were at least potentially useful. Nowadays, and especially in Rust, we don't need the aliases and they don't help with anything. Notably the windows bindings we use also don't bother with the aliases. And even when we have used aliases they're often only used once then forgotten about.
The only one that gives me pause is `DWORD` because it's used a fair bit. But it's still used inconsistently and we implicitly assume it's a `u32` anyway (e.g. `as` casting from an `i32`).
std: removes logarithms family function edge cases handling for solaris.
Issue had been fixed over time with solaris, 11.x behaves correctly
(and we support it as minimum), illumos works correctly too.