Return values larger than 2 registers using a return area pointer
LLVM and Cranelift disagree about how to return values that don't fit in the registers designated for return values. LLVM will force the entire return value to be passed by return area pointer, while Cranelift will look at each IR level return value independently and decide to pass it in a register or not, which would result in the return value being passed partially in registers and partially through a return area pointer.
While Cranelift may need to be fixed as the LLVM behavior is generally more correct with respect to the surface language, forcing this behavior in rustc itself makes it easier for other backends to conform to the Rust ABI and for the C ABI rustc already handles this behavior anyway.
In addition LLVM's decision to pass the return value in registers or using a return area pointer depends on how exactly the return type is lowered to an LLVM IR type. For example `Option<u128>` can be lowered as `{ i128, i128 }` in which case the x86_64 backend would use a return area pointer, or it could be passed as `{ i32, i128 }` in which case the x86_64 backend would pass it in registers by taking advantage of an LLVM ABI extension that allows using 3 registers for the x86_64 sysv call conv rather than the officially specified 2 registers.
This adjustment is only necessary for the Rust ABI as for other ABI's the calling convention implementations in rustc_target already ensure any return value which doesn't fit in the available amount of return registers is passed in the right way for the current target.
Helps with https://github.com/rust-lang/rustc_codegen_cranelift/issues/1525
cc https://github.com/bytecodealliance/wasmtime/issues/9250
ABI: Pass aggregates by value on AIX
On AIX we pass aggregates byval. Adds new ABI for AIX for powerpc64.
313ad85dfa/clang/lib/CodeGen/Targets/PPC.cpp (L216)
Fixes the following 2 testcases on AIX:
```
tests/ui/abi/extern/extern-pass-TwoU16s.rs
tests/ui/abi/extern/extern-pass-TwoU8s.rs
```
This hack was intended to handle the case where `-Clto=thin` causes the
compiler to emit multiple output files (when producing LLVM-IR or assembly).
The hack only affects 4 tests, of which 3 are just meta-tests for the hack
itself. The one remaining test that motivated the hack currently doesn't even
need it!
(`tests/codegen/issues/issue-81408-dllimport-thinlto-windows.rs`)
Add x86_64-unknown-trusty as tier 3 target
This PR adds a third target for the Trusty platform, `x86_64-unknown-trusty`.
Please let me know if an MCP is required. https://github.com/rust-lang/compiler-team/issues/582 was made when adding the first two targets, I can make another one for the new target as well if needed.
# Target Tier Policy Acknowledgements
> A tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)
- Nicole LeGare (```@randomPoison)```
- Andrei Homescu (```@ahomescu)```
- Chris Wailes (chriswailes@google.com)
- As a fallback trusty-dev-team@google.com can be contacted
Note that this does not reflect the maintainers currently listed in [`trusty.md`](c52c23b6f4/src/doc/rustc/src/platform-support/trusty.md). #130452 is currently open to update the list of maintainers in the documentation.
> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.
The new target `x86_64-unknown-trusty` follows the existing naming convention for similar targets.
> Target names should not introduce undue confusion or ambiguity unless absolutely necessary to maintain ecosystem compatibility. For example, if the name of the target makes people extremely likely to form incorrect beliefs about what it targets, the name should be changed or augmented to disambiguate it.
👍
> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.
There are no known legal issues or license incompatibilities.
> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.
👍
> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate (core for most targets, alloc for targets that can support dynamic memory allocation, std for targets with an operating system or equivalent layer of system-provided functionality), but may leave some code unimplemented (either unavailable or stubbed out as appropriate), whether because the target makes it impossible to implement or challenging to implement. The authors of pull requests are not obligated to avoid calling any portions of the standard library on the basis of a tier 3 target not implementing those portions.
This PR only adds the target. `std` support is being worked on and will be added in a future PR.
> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible. If the target supports running binaries, or running tests (even if they do not pass), the documentation must explain how to run such binaries or tests for the target, using emulation if possible or dedicated hardware if necessary.
👍
> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via ```@)``` to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.
👍
> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.
👍
> Tier 3 targets must be able to produce assembly using at least one of rustc's supported backends from any host target. (Having support in a fork of the backend is not sufficient, it must be upstream.)
👍
Improve assembly test for CMSE ABIs
Tracking issues: #75835#81391
This ensures the code-gen for these ABIs does not change silently. There is a small chance that this code-gen might change, however even GCC (https://godbolt.org/z/16arxab5x and https://godbolt.org/z/16arxab5x) generates almost the same assembly for these ABIs. I hope the notes in the comments should help fix the tests if it ever breaks.
Use -0.0 in `intrinsics::simd::reduce_add_unordered`
-0.0 is the actual neutral additive float, not +0.0, and this matters to codegen.
try-job: aarch64-gnu
Add -Z small-data-threshold
This flag allows specifying the threshold size above which LLVM should not consider placing small objects in a `.sdata` or `.sbss` section.
Support is indicated in the target options via the
small-data-threshold-support target option, which can indicate either an
LLVM argument or an LLVM module flag. To avoid duplicate specifications
in a large number of targets, the default value for support is
DefaultForArch, which is translated to a concrete value according to the
target's architecture.
This flag allows specifying the threshold size above which LLVM should
not consider placing small objects in a .sdata or .sbss section.
Support is indicated in the target options via the
small-data-threshold-support target option, which can indicate either an
LLVM argument or an LLVM module flag. To avoid duplicate specifications
in a large number of targets, the default value for support is
DefaultForArch, which is translated to a concrete value according to the
target's architecture.
s390x: Fix a regression related to backchain feature
In #127506, we introduced a new IBM Z-specific target feature, `backchain`.
This particular `target-feature` was available as a function-level attribute in LLVM 17 and below, so some hacks were used to avoid blowing up LLVM when querying the supported LLVM features.
This led to an unfortunate regression where `cfg!(target-feature = "backchain")` will always return true.
This pull request aims to fix this issue, and a test has been introduced to ensure it will never happen again.
Fixes#129927.
r? `@RalfJung`
[testsuite][cleanup] Remove all usages of `dont_merge` hack to avoid function merging
Resolves#129438
The `-Zmerge-functions=disabled` compile flag exists for this purpose.
add `aarch64_unknown_nto_qnx700` target - QNX 7.0 support for aarch64le
This backports the QNX 7.1 aarch64 implementation to 7.0.
* [x] required `-lregex` disabled, see https://github.com/rust-lang/libc/pull/3775 (released in libc 0.2.156)
* [x] uses `libgcc.a` instead of `libgcc_s.so` (7.0 used ancient GCC 5.4 which didn't have gcc_s)
* [x] a fix in `backtrace` crate to support stack traces https://github.com/rust-lang/backtrace-rs/pull/648
This PR bumps libc dependency to 0.2.158
CC: to the folks who did the [initial implementation](https://doc.rust-lang.org/rustc/platform-support/nto-qnx.html): `@flba-eb,` `@gh-tr,` `@jonathanpallant,` `@japaric`
# Compile target
```bash
# Configure qcc build environment
source _path_/_to_/qnx7.0/qnxsdp-env.sh
# Tell rust to use qcc when building QNX 7.0 targets
export build_env='
CC_aarch64-unknown-nto-qnx700=qcc
CFLAGS_aarch64-unknown-nto-qnx700=-Vgcc_ntoaarch64le_cxx
CXX_aarch64-unknown-nto-qnx700=qcc
AR_aarch64_unknown_nto_qnx700=ntoaarch64-ar'
# Build rust compiler, libs, and the remote test server
env $build_env ./x.py build \
--target x86_64-unknown-linux-gnu,aarch64-unknown-nto-qnx700 \
rustc library/core library/alloc library/std src/tools/remote-test-server
rustup toolchain link stage1 build/host/stage1
```
# Compile "hello world"
```bash
source _path_/_to_/qnx7.0/qnxsdp-env.sh
cargo new hello_world
cd hello_world
cargo +stage1 build --release --target aarch64-unknown-nto-qnx700
```
# Configure a remote for testing
Do this from a new shell - we will need to run more commands in the previous one. I ran into these two issues, and found some workarounds.
* Temporary dir might not work properly
* Default `remote-test-server` has issues binding to an address
```
# ./remote-test-server
starting test server
thread 'main' panicked at src/tools/remote-test-server/src/main.rs:175:29:
called `Result::unwrap()` on an `Err` value: Os { code: 249, kind: AddrNotAvailable, message: "Can't assign requested address" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
Specifying `--bind` param actually fixes that, and so does setting `TMPDIR` properly.
```bash
# Copy remote-test-server to remote device. You may need to use sftp instead.
# ATTENTION: Note that the path is different from the one in the remote testing documentation for some reason
scp ./build/x86_64-unknown-linux-gnu/stage1-tools-bin/remote-test-server qnxdevice:/path/
# Run ssh with port forwarding - so that rust tester can connect to the local port instead
ssh -L 12345:127.0.0.1:12345 qnxdevice
# on the device, run
rm -rf tmp && mkdir -p tmp && TMPDIR=$PWD/tmp ./remote-test-server --bind 0.0.0.0:12345
```
# Run test suit
Assume all previous environment variables are still set, or re-init them
```bash
export TEST_DEVICE_ADDR="localhost:12345"
# tidy needs to be skipped due to using un-published libc dependency
export exclude_tests='
--exclude src/bootstrap
--exclude src/tools/error_index_generator
--exclude src/tools/linkchecker
--exclude src/tools/tidy
--exclude tests/ui-fulldeps
--exclude rustc
--exclude rustdoc
--exclude tests/run-make-fulldeps'
env $build_env ./x.py test $exclude_tests --stage 1 --target aarch64-unknown-nto-qnx700
```
try-job: dist-x86_64-msvc
Add `f16` and `f128` inline ASM support for `aarch64`
Adds `f16` and `f128` inline ASM support for `aarch64`. SIMD vector types are taken from [the ARM intrinsics list](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:`@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiesarchitectures=[A64]).` Based on the work of `@lengrongfu` in #127043.
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
try-job: aarch64-gnu
try-job: aarch64-apple
Add powerpc-unknown-linux-muslspe compile target
This is almost identical to already existing targets:
- powerpc_unknown_linux_musl.rs
- powerpc_unknown_linux_gnuspe.rs
It has support for PowerPC SPE (muslspe), which
can be used with GCC version up to 8. It is useful for Freescale or IBM cores like e500.
This was verified to be working with OpenWrt build system for CZ.NIC's Turris 1.x routers, which are using Freescale P2020, e500v2, so add it as a Tier 3 target.
Follow-up of https://github.com/rust-lang/rust/pull/100860
Refactor `powerpc64` call ABI handling
As the [specification](https://openpowerfoundation.org/specifications/64bitelfabi/) for the ELFv2 ABI states that returned aggregates are returned like arguments as long as they are at most two doublewords, I've merged the `classify_arg` and `classify_ret` functions to reduce code duplication. The only functional change is to fix#128579: the `classify_ret` function was incorrectly handling aggregates where `bits > 64 && bits < 128`. I've used the aggregate handling implementation from `classify_arg` which doesn't have this issue.
`@awilfox` could you test this on `powerpc64-unknown-linux-musl`? I'm only able to cross-test on `powerpc64-unknown-linux-gnu` and `powerpc64le-unknown-linux-gnu` locally at the moment, and as a tier 3 target `powerpc64-unknown-linux-musl` has zero CI coverage.
Fixes: #128579
Let InstCombine remove Clone shims inside Clone shims
The Clone shims that we generate tend to recurse into other Clone shims, which gets very silly very quickly. Here's our current state: https://godbolt.org/z/E69YeY8eq
So I've added InstSimplify to the shims optimization passes, and improved `is_trivially_pure_clone_copy` so that it can delete those calls inside the shim. This makes the shim way smaller because most of its size is the required ceremony for unwinding.
This change also completely breaks the UI test added for https://github.com/rust-lang/rust/issues/104870. With this PR, that program ICEs in MIR type checking because `is_trivially_pure_clone_copy` and the trait solver disagree on whether `*mut u8` is `Copy`. And adding the requisite `Copy` impl to make them agree makes the test not generate any diagnostics. Considering that I spent most of my time on this PR fixing `#![no_core]` tests, I would prefer to just delete this one. The maintenance burden of `#![no_core]` is uniquely high because when they break they tend to break in very confusing ways.
try-job: x86_64-mingw