`crate::` -> `core::`
It looks weird to have `crate::` in the link text and we use the actual
crate name everywhere else.
If anyone is curious, I used this Vim command to update all the links:
```vim
%s/\(\s\)\[`crate::\(.*\)`\]/\1[`core::\2`](crate::\2)/g
```
Add fetch_update methods to AtomicBool and AtomicPtr
These methods were stabilized for the integer atomics in #71843, but were never added to the non-integer atomics `AtomicBool` and `AtomicPtr`.
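For reference, a quick sketch of how the new methods are used (mirroring the existing integer versions):

```rust
use std::sync::atomic::{AtomicBool, AtomicPtr, Ordering};

fn main() {
    let flag = AtomicBool::new(false);
    // The closure may run multiple times if another thread races;
    // the previous value is returned on success.
    let prev = flag.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |b| Some(!b));
    assert_eq!(prev, Ok(false));

    let mut value = 10;
    let ptr = AtomicPtr::new(std::ptr::null_mut());
    // Only install the new pointer if the current one is null.
    let _ = ptr.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |p| {
        if p.is_null() { Some(&mut value as *mut i32) } else { None }
    });
    assert!(!ptr.load(Ordering::SeqCst).is_null());
}
```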
Point out that total_cmp is no strict superset of partial comparison
Partial comparison and `total_cmp` do not define the same ordering. Pointing this out helps prevent the mistake of creating float wrappers that base their `Ord` impl on `total_cmp` but their `PartialOrd` impl on the float type's own `PartialOrd`. `PartialOrd` and `Ord` [are required to agree with each other](https://doc.rust-lang.org/std/cmp/trait.Ord.html#how-can-i-implement-ord).
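To make the disagreement concrete, a small example (assuming the usual quiet `f64::NAN`):

```rust
use std::cmp::Ordering;

fn main() {
    // total_cmp distinguishes -0.0 from +0.0 and gives NaN a defined
    // position, while PartialOrd treats -0.0 == +0.0 and refuses to
    // order NaN at all.
    assert_eq!((-0.0_f64).total_cmp(&0.0), Ordering::Less);
    assert_eq!((-0.0_f64).partial_cmp(&0.0), Some(Ordering::Equal));

    assert_eq!(f64::NAN.total_cmp(&1.0), Ordering::Greater);
    assert_eq!(f64::NAN.partial_cmp(&1.0), None);
}
```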
Trivial fixes to bitwise operator documentation
Fixed the documentation of `BitAnd`, `BitOr`, `BitXor`, and `BitAndAssign`, where the example implementation on `Vector<bool>` used logical operators in place of the bitwise operators.
r? @steveklabnik
Closes #78619
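For context, a hedged sketch in the spirit of the corrected examples (the wrapper type here is made up; the point is that `bitand` uses the bitwise `&`, not the short-circuiting `&&`):

```rust
use std::ops::BitAnd;

#[derive(Debug, PartialEq)]
struct BoolVector(Vec<bool>);

impl BitAnd for BoolVector {
    type Output = Self;

    // Element-wise bitwise AND of the two vectors.
    fn bitand(self, rhs: Self) -> Self {
        BoolVector(self.0.iter().zip(rhs.0.iter()).map(|(x, y)| *x & *y).collect())
    }
}

fn main() {
    let a = BoolVector(vec![true, true, false]);
    let b = BoolVector(vec![true, false, false]);
    assert_eq!(a & b, BoolVector(vec![true, false, false]));
}
```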
Clarify handling of final line ending in str::lines()
I found the description as it stands a bit confusing. I've added a bit more explanation to make it clear that a trailing line ending does not produce a final empty line.
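A quick illustration of the behaviour being clarified:

```rust
fn main() {
    // A final line ending does not produce a trailing empty line...
    let lines: Vec<&str> = "foo\r\nbar\n".lines().collect();
    assert_eq!(lines, ["foo", "bar"]);

    // ...but a blank line before the final line ending does.
    let lines: Vec<&str> = "foo\nbar\n\n".lines().collect();
    assert_eq!(lines, ["foo", "bar", ""]);
}
```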
fix various aliasing issues in the standard library
This fixes various cases where the standard library either used raw pointers after they were already invalidated by using the original reference again, or created raw pointers for one element of a slice and used it to access neighboring elements.
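A minimal sketch (not the actual library code) of the second kind of fix: deriving element pointers from the slice's own base pointer, whose provenance covers the whole slice, instead of from a reference to a single element:

```rust
fn copy_first_to_second(s: &mut [u8]) {
    assert!(s.len() >= 2);
    // Problematic pattern: `let first = &mut s[0] as *mut u8;` followed by
    // `first.add(1)` reaches outside the element the pointer was created for.
    // Fixed pattern: derive both accesses from the slice's base pointer.
    let base = s.as_mut_ptr();
    unsafe {
        *base.add(1) = *base;
    }
}

fn main() {
    let mut buf = [1u8, 2, 3];
    copy_first_to_second(&mut buf);
    assert_eq!(buf, [1, 1, 3]);
}
```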
Fix doc links to std::fmt
`std::format` and `core::write` macros' docs linked to `core::fmt` for the format string reference, even though only `std::fmt` has format string documentation (and the link titles were `std::fmt`).
Improve documentation for slice strip_* functions
Prompted by the stabilisation tracking issue #73413, I looked at the docs for `strip_prefix` and `strip_suffix` for both `str` and `slice`, and I felt they could be slightly improved.
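As a refresher on the behaviour in question (a sketch mirroring the examples in those docs): `strip_prefix` returns the remainder wrapped in `Some` when the prefix matches and `None` otherwise, and `strip_suffix` is the mirror image.

```rust
fn main() {
    // On `str`:
    assert_eq!("foo:bar".strip_prefix("foo:"), Some("bar"));
    assert_eq!("foo:bar".strip_prefix("bar"), None);
    assert_eq!("foo.rs".strip_suffix(".rs"), Some("foo"));

    // On slices:
    let v = [10, 40, 30];
    assert_eq!(v.strip_prefix(&[10]), Some(&[40, 30][..]));
    assert_eq!(v.strip_suffix(&[30]), Some(&[10, 40][..]));
    assert_eq!(v.strip_prefix(&[50]), None);
}
```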
Thanks for your attention.
Separate unsized locals
Closes #71694
Takes over #72029 and #74971 once again
cc @RalfJung @oli-obk @pnkfelix @eddyb as they've participated in previous reviews of this PR.
The lint checks arguments in calls to `transmute` or functions that have `Pointer` as a trait bound, and displays a warning if the argument is a function reference. It also checks for `std::fmt::Pointer::fmt` to handle formatting macros, although it doesn't depend on the exact expansion of the macro or on formatting internals. `std::fmt::Pointer` and `std::fmt::Pointer::fmt` were also added as diagnostic items and symbols.
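A hedged sketch of the kind of code the lint is aimed at (illustrative only; the cast shown is one way to get the actual function address):

```rust
fn foo() {}

fn main() {
    // Passing a function item by reference to a `Pointer`-bound API formats
    // the address of a temporary reference to the zero-sized function item,
    // which is rarely what was intended; this is the pattern the lint flags.
    println!("{:p}", &foo);

    // Casting to a function pointer first prints the function's address.
    println!("{:p}", foo as fn());
}
```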
Add [T]::as_chunks(_mut)
Allows getting the slices directly, rather than just through an iterator as in `array_chunks(_mut)`. The constructors for those iterators are then written in terms of these methods, so the iterator constructors no longer have any `unsafe` of their own.
Unstable, of course. #74985
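A quick sketch of the intended usage (unstable, so the exact surface may still change):

```rust
fn main() {
    let slice = [1, 2, 3, 4, 5];
    // The complete chunks of 2, plus whatever is left over,
    // without going through an iterator.
    let (chunks, remainder) = slice.as_chunks::<2>();
    assert_eq!(chunks, &[[1, 2], [3, 4]]);
    assert_eq!(remainder, &[5]);
}
```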
Optimise align_offset for stride=1 further
The `stride == 1` case can be computed more efficiently through `-p (mod a)`. That then translates to a nice and short sequence of LLVM instructions:

```llvm
%address = ptrtoint i8* %p to i64
%negptr = sub i64 0, %address
%offset = and i64 %negptr, %a_minus_one
```
And produces pretty much ideal code-gen when this function is used in
isolation.
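In Rust terms the computation is just the following (a sketch, not the library's exact implementation):

```rust
// `-p (mod a)` for a power-of-two `a`: the offset that bumps `p`
// up to the next multiple of `a`.
fn offset_to_align(p: *const u8, a: usize) -> usize {
    debug_assert!(a.is_power_of_two());
    (p as usize).wrapping_neg() & (a - 1)
}

fn main() {
    let buf = [0u8; 64];
    let p = buf.as_ptr().wrapping_add(3);
    let off = offset_to_align(p, 8);
    assert_eq!((p as usize + off) % 8, 0);
}
```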
Typical use of this function will, however, involve using the result to offset a pointer:

```llvm
%aligned = getelementptr inbounds i8, i8* %p, i64 %offset
```
This still looks very good, but LLVM does not really translate it to what would be considered ideal machine code (on any target). For example, this is the codegen we obtain for an unknown alignment:

```asm
; x86_64
dec rsi
mov rax, rdi
neg rax
and rax, rsi
add rax, rdi
```
In particular, negating a pointer is not something CISC architectures like x86_64 are designed to optimise for; they are much better at offsetting pointers. So we'd love to utilize this ability and produce code that's more like this:

```asm
; x86_64
lea rax, [rsi + rdi - 1]
neg rsi
and rax, rsi
```
To achieve this we need to give LLVM an opportunity to apply the various peephole optimisations it performs during DAG selection. In particular, the `and` instruction appears to be a major inhibitor here. We cannot, sadly, get rid of this load-bearing operation, but we can reorder operations so that LLVM has more to work with around this instruction.
One such ordering is proposed in #75579 and results in LLVM IR that
looks broadly like this:
```llvm
; using add enables `lea` and similar CISCisms
%offset_ptr = add i64 %address, %a_minus_one
%mask = sub i64 0, %a
%masked = and i64 %offset_ptr, %mask
; can be folded with `gepi` that may follow
%offset = sub i64 %masked, %address
```
…and generates the intended x86_64 machine code.
One might also wonder how the increased amount of code would impact a
RISC target. Turns out not much:
```asm
; aarch64 previous        ; aarch64 new
sub x8, x1, #1            add x8, x1, x0
neg x9, x0                sub x8, x8, #1
and x8, x9, x8            neg x9, x1
add x0, x0, x8            and x0, x8, x9
```
(and similarly for ppc, sparc, mips, riscv, etc)
The only target that seems to do worse is… wasm32.
Onto actual measurements: the best way to evaluate snippets like these is to use llvm-mca. Much as the AArch64 assembly above suggests, there isn't any performance difference to be found; both snippets execute in the same number of cycles on the CPUs I tried. On x86_64, however, we get a throughput improvement of >50%!
Fixes #75579
Properly define va_arg and va_list for aarch64-apple-darwin
From [Apple][]:
> Because of these changes, the type `va_list` is an alias for `char*`,
> and not for the struct type in the generic procedure call standard.
With this change, `./x.py test --stage 1 src/test/ui/abi/variadic-ffi` passes.
Fixes #78092
[Apple]: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms
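For illustration, a minimal variadic FFI call of the kind such tests exercise (calling libc's `printf`; not the test's actual code):

```rust
use std::os::raw::{c_char, c_int};

extern "C" {
    fn printf(fmt: *const c_char, ...) -> c_int;
}

fn main() {
    unsafe {
        // On aarch64-apple-darwin the variadic arguments go through a
        // `char*`-style `va_list`, per the Apple document quoted above.
        printf(b"%d + %d = %d\n\0".as_ptr() as *const c_char, 1, 2, 3);
    }
}
```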
transmute_copy: explain that alignment is handled correctly
The doc comment is currently somewhat misleading: if it actually transmuted `&T` to `&U`, a higher-aligned `U` would be problematic.
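A small example of the case the clarified docs are about: the source may be less aligned than the destination type, and `transmute_copy` still handles it because it copies the value out instead of reinterpreting a reference:

```rust
use std::mem::transmute_copy;

fn main() {
    // `[u8; 4]` has alignment 1, `u32` has alignment 4. A reference-level
    // transmute `&[u8; 4] -> &u32` could yield a misaligned reference, but
    // `transmute_copy` reads the bytes out by value, so the lower alignment
    // of the source is fine.
    let bytes: [u8; 4] = [1, 0, 0, 0];
    let n: u32 = unsafe { transmute_copy(&bytes) };
    assert_eq!(n, u32::from_ne_bytes(bytes));
}
```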
replace `#[allow_internal_unstable]` with `#[rustc_allow_const_fn_unstable]` for `const fn`s
`#[allow_internal_unstable]` is currently used to side-step feature gate and stability checks.
While it was originally meant to be used only on macros, its use was expanded to `const fn`s.
This PR adds stricter checks for the usage of `#[allow_internal_unstable]` (only on macros) and introduces the `#[rustc_allow_const_fn_unstable]` attribute for use on `const fn`s.
This PR does not change any of the functionality associated with the use of `#[allow_internal_unstable]` on macros or the usage of `#[rustc_allow_const_fn_unstable]` (instead of `#[allow_internal_unstable]`) on `const fn`s (see https://github.com/rust-lang/rust/issues/69399#issuecomment-712911540).
Note: The check for `#[rustc_allow_const_fn_unstable]` currently only validates that the attribute is used on a function, because I don't know how I would check whether the function is a `const fn` at the place of the check. I have therefore opened this as a draft pull request.
Closes rust-lang/rust#69399
r? @oli-obk