Fix stack overflow detection on FreeBSD 11.1+
Beginning with FreeBSD 10.4 and 11.1, there is one guard page by
default. And the stack autoresizes, so if Rust allocates its own guard
page, then FreeBSD's will simply move up one page. The best solution is
to just use the OS's guard page.
Rework `std::sys::windows::alloc`
I came across https://github.com/rust-lang/rust/pull/76676#discussion_r488729990, which points out that there was unsound code in the Windows alloc code, creating a &mut to possibly uninitialized memory. I reworked the code so that that particular issue does not occur anymore, and started adding more documentation and safety comments.
Full list of changes:
- moved and documented the relevant Windows Heap API functions
- refactor `allocate_with_flags` to `allocate` (and remove the other helper functions), which now takes just a `bool` if the memory should be zeroed
- add checks for if `GetProcessHeap` returned null
- add a test that checks if the size and alignment of a `Header` are indeed <= `MIN_ALIGN`
- add `#![deny(unsafe_op_in_unsafe_fn)]` and the necessary unsafe blocks with safety comments
I feel like I may have overdone the documenting, the unsoundness fix is the most important part; I could spit this PR up in separate parts.
Fix comment typo in once.rs
I believe I came across a minor typo in a comment. I am not particularly familiar with this part of the codebase, but I have read the surrounding code as well as the referenced `park` and `unpark` functions, and I believe my proposed change is true to the intended meaning of the comment.
I intentionally tried to keep the change as minimal as possible. If I have the maintainers' permission, I'd also love to add a comma to improve readability as follows: `Luckily ``park`` comes with the guarantee that if it got an ``unpark`` just before on an unparked thread, it does not park.`
Rename `#[doc(spotlight)]` to `#[doc(notable_trait)]`
Fixes#80936.
"spotlight" is not a very specific or self-explaining name.
Additionally, the dialog that it triggers is called "Notable traits".
So, "notable trait" is a better name.
* Rename `#[doc(spotlight)]` to `#[doc(notable_trait)]`
* Rename `#![feature(doc_spotlight)]` to `#![feature(doc_notable_trait)]`
* Update documentation
* Improve documentation
r? `@Manishearth`
Beginning with FreeBSD 10.4 and 11.1, there is one guard page by
default. And the stack autoresizes, so if Rust allocates its own guard
page, then FreeBSD's will simply move up one page. The best solution is
to just use the OS's guard page.
Disallow octal format in Ipv4 string
In its original specification, leading zero in Ipv4 string is interpreted
as octal literals. So a IP address 0127.0.0.1 actually means 87.0.0.1.
This confusion can lead to many security vulnerabilities. Therefore, in
[IETF RFC 6943], it suggests to disallow octal/hexadecimal format in Ipv4
string all together.
Existing implementation already disallows hexadecimal numbers. This commit
makes Parser reject octal numbers.
Fixes#83648.
[IETF RFC 6943]: https://tools.ietf.org/html/rfc6943#section-3.1.1
In its original specification, leading zero in Ipv4 string is interpreted
as octal literals. So a IP address 0127.0.0.1 actually means 87.0.0.1.
This confusion can lead to many security vulnerabilities. Therefore, in
[IETF RFC 6943], it suggests to disallow octal/hexadecimal format in Ipv4
string all together.
Existing implementation already disallows hexadecimal numbers. This commit
makes Parser reject octal numbers.
Fixes#83648.
[IETF RFC 6943]: https://tools.ietf.org/html/rfc6943#section-3.1.1
unix: Fix feature(unix_socket_ancillary_data) on macos and other BSDs
This adds support for CMSG handling on macOS and fixes it on OpenBSD and possibly other BSDs.
When traversing the CMSG list, the previous code had an exception for Android where the next element after the last pointer could point to the first pointer instead of NULL. This is actually not specific to Android: the `libc::CMSG_NXTHDR` implementation for Linux and emscripten have a special case to return NULL when the length of the previous element is zero; most other implementations simply return the previous element plus a zero offset in this case.
This MR makes the check non-optional which fixes CMSG handling and a possible endless loop on such systems; tested with file descriptor passing on OpenBSD, Linux, and macOS.
This MR additionally adds `SocketAncillary::is_empty` because clippy is right that it should be added.
This belongs to the `feature(unix_socket_ancillary_data)` tracking issue: https://github.com/rust-lang/rust/issues/76915
r? `@joshtriplett`
Improve fs error open_from unix
Consistency for #79399
Suggested by JohnTitor
r? `@JohnTitor`
Not user if the error is too long now, do we handle long errors well?
Add function core::iter::zip
This makes it a little easier to `zip` iterators:
```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```
You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:
```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```
It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:
```rust
iter::zip(
trait_ref.substs.types().skip(1),
impl_trait_ref.substs.types().skip(1),
)
// vs.
trait_ref
.substs
.types()
.skip(1)
.zip(impl_trait_ref.substs.types().skip(1))
```
This replaces the tuple-pair `IntoIterator` in #78204.
There is prior art for the utility of this in [`itertools::zip`].
[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
Improve Debug implementations of Mutex and RwLock.
This improves the Debug implementations of Mutex and RwLock.
They now show the poison flag and use debug_non_exhaustive. (See #67364.)
Derive Debug for io::Chain instead of manually implementing it.
This derives Debug for io::Chain instead of manually implementing it.
The manual implementation has the same bounds, so I don't think there's any reason for a manual implementation. The names used in the derive implementation are even nicer (`first`/`second`) than the manual implementation (`t`/`u`), and include the `done_first` field too.
Fix Debug implementation for RwLock{Read,Write}Guard.
This would attempt to print the Debug representation of the lock that the guard has locked, which will try to lock again, fail, and just print `"<locked>"` unhelpfully.
After this change, this just prints the contents of the mutex, like the other smart pointers (and MutexGuard) do.
MutexGuard had this problem too: https://github.com/rust-lang/rust/issues/57702
ExitStatus: print "exit status: {}" rather than "exit code: {}" on unix
Proper Unix terminology is "exit status" (vs "wait status"). "exit
code" is imprecise on Unix and therefore unclear. (As far as I can
tell, "exit code" is correct terminology on Windows.)
This new wording is unfortunately inconsistent with the identifier
names in the Rust stdlib.
It is the identifier names that are wrong, as discussed at length in eg
https://doc.rust-lang.org/nightly/std/process/struct.ExitStatus.htmlhttps://doc.rust-lang.org/nightly/std/os/unix/process/trait.ExitStatusExt.html
Unfortunately for API stability reasons it would be a lot of work, and
a lot of disruption, to change the names in the stdlib (eg to rename
`std::process::ExitStatus` to `std::process::ChildStatus` or
something), but we should fix the message output. Many (probably
most) readers of these messages about exit statuses will be users and
system administrators, not programmers, who won't even know that Rust
has this wrong terminology.
So I think the right thing is to fix the documentation (as I have
already done) and, now, the terminology in the implementation.
This is a user-visible change to the behaviour of all Rust programs
which run Unix subprocesses. Hopefully no-one is matching against the
exit status string, except perhaps in tests.
The manual implementation has the same bounds, so I don't think there's
any reason for a manual implementation. The names used in the derive
implementation are even nicer (`first`/`second`) than the manual
implementation (`t`/`u`), and include the `done_first` field too.
This would attempt to print the Debug representation of the lock that
the guard has locked, which will try to lock again, fail, and just print
"<locked>" unhelpfully.
After this change, this just prints the contents of the mutex, like the
other smart pointers (and MutexGuard) do.
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in https://github.com/rust-lang/rfcs/issues/1074, and closes#24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.
While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).
There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
Document that the SocketAddr memory representation is not stable
Intended to help out with #78802. Work has been put into finding and fixing code that assumes the memory layout of `SocketAddrV4` and `SocketAddrV6`. But it turns out there are cases where new code continues to make the same assumption ([example](96927dc2b7 (diff-917db3d8ca6f862ebf42726b23c72a12b35e584e497ebdb24e474348d7c6ffb6R610-R621))).
The memory layout of a type in `std` is never part of the public API. Unless explicitly stated I guess. But since that is invalidly relied upon by a considerable amount of code for these particular types, it might make sense to explicitly document this. This can be temporary. Once #78802 lands it does not make sense to rely on the layout any longer, and this documentation can also be removed.
This adds support for CMSG handling on macOS and fixes it on OpenBSD
and other BSDs.
When traversing the CMSG list, the previous code had an exception for
Android where the next element after the last pointer could point to
the first pointer instead of NULL. This is actually not specific to
Android: the `libc::CMSG_NXTHDR` implementation for Linux and
emscripten have a special case to return NULL when the length of the
previous element is zero; most other implementations simply return the
previous element plus a zero offset in this case.
This MR additionally adds `SocketAncillary::is_empty` because clippy
is right that it should be added.
Proper Unix terminology is "exit status" (vs "wait status"). "exit
code" is imprecise on Unix and therefore unclear. (As far as I can
tell, "exit code" is correct terminology on Windows.)
This new wording is unfortunately inconsistent with the identifier
names in the Rust stdlib.
It is the identifier names that are wrong, as discussed at length in eg
https://doc.rust-lang.org/nightly/std/process/struct.ExitStatus.htmlhttps://doc.rust-lang.org/nightly/std/os/unix/process/trait.ExitStatusExt.html
Unfortunately for API stability reasons it would be a lot of work, and
a lot of disruption, to change the names in the stdlib (eg to rename
`std::process::ExitStatus` to `std::process::ChildStatus` or
something), but we should fix the message output. Many (probably
most) readers of these messages about exit statuses will be users and
system administrators, not programmers, who won't even know that Rust
has this wrong terminology.
So I think the right thing is to fix the documentation (as I have
already done) and, now, the terminology in the implementation.
This is a user-visible change to the behaviour of all Rust programs
which run Unix subprocesses. Hopefully no-one is matching against the
exit status string, except perhaps in tests.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Add internal io::Error::new_const to avoid allocations.
This makes it possible to have a io::Error containing a message with zero allocations, and uses that everywhere to avoid the *three* allocations involved in `io::Error::new(kind, "message")`.
The function signature isn't perfect, because it needs a reference to the `&str`. So for now, this is just a `pub(crate)` function. Later, we'll be able to use `fn new_const<MSG: &'static str>(kind: ErrorKind)` to make that a bit better. (Then we'll also be able to use some ZST trickery if that would result in more efficient code.)
See https://github.com/rust-lang/rust/issues/83352
"semantic equivalence" is too strong a phrasing here, which is why
actually explaining what kind of circumstances might produce a -0
was chosen instead.
Move `std::sys::unix::platform` to `std::sys::unix::ext`
This moves the operating system dependent alias `platform` (`std::os::{linux, android, ...}`) from `std::sys::unix` to `std::sys::unix::ext` (a.k.a. `std::os::unix`), removing the need for compatibility code in `unix_ext` when documenting on another platform.
This is also a step in making it possible to properly move `std::sys::unix::ext` to `std::os::unix`, as ideally `std::sys` should not depend on the rest of `std`.
Deprecate std::os::haiku::raw, which accidentally wasn't deprecated
In early 2016, all `std::os::*::raw` modules [were deprecated](aa23c98450) in accordance with [RFC 1415](https://github.com/rust-lang/rfcs/blob/master/text/1415-trim-std-os.md). However, at this same time support for Haiku was being added to libstd, landing shortly after the aforementioned commit, and due to some crossed wires a `std::os::haiku::raw` module was added and was not marked as deprecated.
I have been in correspondence with the author of the Haiku patch, ````@nielx,```` who has confirmed that this was simply an oversight and that the definitions from the libc crate should be preferred instead.
Clarify docs for Read::read's return value
Right now the docs for `Read::read`'s return value are phrased in a way that makes it easy for the reader to assume that the return value is never larger than the passed buffer. This PR clarifies that this is a requirement for implementations of the trait, but that callers have to expect a buggy yet safe implementation failing to do so, especially if unchecked accesses to the buffer are done afterwards.
I fell into this trap recently, and when I noticed, I looked at the docs again and had the feeling that I might not have been the first one to miss this.
The same issue of trusting the return value of `read` was also present in std itself for about 2.5 years and only fixed recently, see #80895.
I hope that clarifying the docs might help others to avoid this issue.
Reuse `std::sys::unsupported::pipe` on `hermit`
Pipes are not supported on `hermit` and `hermit/pipe.rs` is identical to `unsupported/pipe.rs`. This PR reduces duplication between the two by doing the following on `hermit`:
```rust
#[path = "../unsupported/pipe.rs"]
pub mod pipe;
```
Add more links between hash and btree collections
- Link from `core::hash` to `HashMap` and `HashSet`
- Link from HashMap and HashSet to the module-level documentation on
when to use the collection
- Link from several collections to Wikipedia articles on the general
concept
See also https://github.com/rust-lang/rust/pull/81989#issuecomment-783920840.
Deprecate `intrinsics::drop_in_place` and `collections::Bound`, which accidentally weren't deprecated
Fixes#82080.
I've taken the liberty of updating the `since` values to 1.52, since an unobservable deprecation isn't much of a deprecation (even the detailed release notes never bothered to mention these deprecations).
As mentioned in the issue I'm *pretty* sure that using a type alias for `Bound` is semantically equivalent to the re-export; [the reference implies](https://doc.rust-lang.org/reference/items/type-aliases.html) that type aliases only observably differ from types when used on unit structs or tuple structs, whereas `Bound` is an enum.
Deprecate RustcEncodable and RustcDecodable.
We can't remove the `RustcEncodable` and `RustcDecodable` derive macros from the prelude, but we can deprecate them.
Added `try_exists()` method to `std::path::Path`
This method is similar to the existing `exists()` method, except it
doesn't silently ignore the errors, leading to less error-prone code.
This change intentionally does NOT touch the documentation of `exists()`
nor recommend people to use this method while it's unstable.
Such changes are reserved for stabilization to prevent confusing people.
Apart from that it avoids conflicts with #80979.
`@joshtriplett` requested this PR in [internals discussion](https://internals.rust-lang.org/t/the-api-of-path-exists-encourages-broken-code/13817/25?u=kixunil)
"spotlight" is not a very specific or self-explaining name.
Additionally, the dialog that it triggers is called "Notable traits".
So, "notable trait" is a better name.
* Rename `#[doc(spotlight)]` to `#[doc(notable_trait)]`
* Rename `#![feature(doc_spotlight)]` to `#![feature(doc_notable_trait)]`
* Update documentation
* Improve documentation
use RWlock when accessing os::env (take 2)
This reverts commit acdca316c3 (#82877) i.e. redoes #81850 since the invalid unlock attempts in the child process have been fixed in #82949
r? `@joshtriplett`
Demonstrate best practice for feeding stdin of a child processes
Documentation change.
It's possible to create a deadlock with stdin/stdout I/O on a single thread:
* the child process may fill its stdout buffer, and have to wait for the parent process to read it,
* but the parent process may be waiting until its stdin write finishes before reading the stdout.
Therefore, the parent process should use separate threads for writing and reading.
These examples are not deadlocking in practice, because they use short strings, but I think it's better to demonstrate code that works even for long writes. The problem is non-obvious and tricky to debug (it seems that even libstd has a similar issue: #45572).
This also demonstrates how to use stdio with threads: it's not obvious that `.take()` can be used to avoid fighting with the borrow checker.
I've checked that the modified examples run fine.
std: Fix a bug on the wasm32-wasi target opening files
This commit fixes an issue pointed out in #82758 where LTO changed the
behavior of a program. It turns out that LTO was not at fault here, it
simply uncovered an existing bug. The bindings to
`__wasilibc_find_relpath` assumed that the relative portion of the path
returned was always contained within thee input `buf` we passed in. This
isn't actually the case, however, and sometimes the relative portion of
the path may reference a sub-portion of the input string itself.
The fix here is to use the relative path pointer coming out of
`__wasilibc_find_relpath` as the source of truth. The `buf` used for
local storage is discarded in this function and the relative path is
copied out unconditionally. We might be able to get away with some
`Cow`-like business or such to avoid the extra allocation, but for now
this is probably the easiest patch to fix the original issue.
Implement Extend and FromIterator for OsString
Add the following trait impls:
- `impl Extend<OsString> for OsString`
- `impl<'a> Extend<&'a OsStr> for OsString`
- `impl FromIterator<OsString> for OsString`
- `impl<'a> FromIterator<&'a OsStr> for OsString`
Because `OsString` is a platform string with no particular semantics, concatenating them together seems acceptable.
I came across a use case for these trait impls in https://github.com/artichoke/artichoke/pull/1089:
Artichoke is a Ruby interpreter. Its CLI accepts multiple `-e` switches for executing inline Ruby code, like:
```console
$ cargo -q run --bin artichoke -- -e '2.times {' -e 'puts "foo: #{__LINE__}"' -e '}'
foo: 2
foo: 2
```
I use `clap` for command line argument parsing, which collects these `-e` commands into a `Vec<OsString>`. To pass these commands to the interpreter for `Eval`, I need to join them together. Combining these impls with `Iterator::intersperse` https://github.com/rust-lang/rust/issues/79524 would enable me to build a single bit of Ruby code.
Currently, I'm doing something like:
```rust
let mut commands = commands.into_iter();
let mut buf = if let Some(command) = commands.next() {
command
} else {
return Ok(Ok(()));
};
for command in commands {
buf.push("\n");
buf.push(command);
}
```
If there's interest, I'd also like to add impls for `Cow<'a, OsStr>`, which would avoid allocating the `"\n"` `OsString` in the concatenate + intersperse use case.
Fix io::copy specialization using copy_file_range when writer was opened with O_APPEND
fixes#82410
While `sendfile()` returns `EINVAL` when the output was opened with O_APPEND, `copy_file_range()` does not and returns `EBADF` instead, which – unlike other `EBADF` causes – is not fatal for this operation since a regular `write()` will likely succeed.
We now treat `EBADF` as a non-fatal error for `copy_file_range` and fall back to a read-write copy as we already did for several other errors.
Do not attempt to unlock envlock in child process after a fork.
This implements the first two points from https://github.com/rust-lang/rust/issues/64718#issuecomment-793030479
This is a breaking change for cases where the environment is accessed in a Command::pre_exec closure. Except for single-threaded programs these uses were not correct anyway since they aren't async-signal safe.
Note that we had a ui test that explicitly tried `env::set_var` in `pre_exec`. As expected it failed with these changes when I tested locally.
Edition-specific preludes
This changes `{std,core}::prelude` to export edition-specific preludes under `rust_2015`, `rust_2018` and `rust_2021`. (As suggested in https://github.com/rust-lang/rust/issues/51418#issuecomment-395630382.) For now they all just re-export `v1::*`, but this allows us to add things to the 2021edition prelude soon.
This also changes the compiler to make the automatically injected prelude import dependent on the selected edition.
cc `@rust-lang/libs` `@djc`