Commit Graph

1057 Commits

Author SHA1 Message Date
Stuart Cook
6ebe590e41
Rollup merge of #135847 - edwloef:slice_ptr_rotate_opt, r=scottmcm
optimize slice::ptr_rotate for small rotates

r? `@scottmcm`

This swaps the positions and numberings of algorithms 1 and 2 in `slice::ptr_rotate`, and pulls the entire outer loop into algorithm 3 since it was redundant for the first two. Effectively, `ptr_rotate` now always does the `memcpy`+`memmove`+`memcpy` sequence if the shifts fit into the stack buffer.
With this change, an `IndexMap`-style `move_index` function is optimized correctly.

Assembly comparisons:
- `move_index`, before: https://godbolt.org/z/Kr616KnYM
- `move_index`, after: https://godbolt.org/z/1aoov6j8h
- the code from `#89714`, before: https://godbolt.org/z/Y4zaPxEG6
- the code from `#89714`, after: https://godbolt.org/z/1dPx83axc

related to #89714
some relevant discussion in https://internals.rust-lang.org/t/idea-shift-move-to-efficiently-move-elements-in-a-vec/22184

Behavior tests pass locally. I can't get any consistent microbenchmark results on my machine, but the assembly diffs look promising.
2025-01-30 14:25:04 +11:00
edwloef
fb3d1d0c4b
add inline attribute and codegen test 2025-01-29 19:34:19 +01:00
bors
a1d7676d6a Auto merge of #136227 - fmease:rollup-ewpvznh, r=fmease
Rollup of 9 pull requests

Successful merges:

 - #136121 (Deduplicate operand creation between scalars, non-scalars and string patterns)
 - #136134 (Fix SIMD codegen tests on LLVM 20)
 - #136153 (Locate asan-odr-win with other sanitizer tests)
 - #136161 (rustdoc: add nobuild typescript checking to our JS)
 - #136166 (interpret: is_alloc_live: check global allocs last)
 - #136168 (GCI: Don't try to eval / collect mono items inside overly generic free const items)
 - #136170 (Reject unsound toggling of Arm atomics-32 target feature)
 - #136176 (Render pattern types nicely in mir dumps)
 - #136186 (uefi: process: Fix args)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-01-29 11:27:18 +00:00
León Orell Valerian Liehr
01f1dab0e9
Rollup merge of #136134 - nikic:llvm-20-simd-tests, r=Urgau
Fix SIMD codegen tests on LLVM 20

The splat constants are printed differently on LLVM 20.
2025-01-29 06:03:21 +01:00
León Orell Valerian Liehr
7e123e4940
Rollup merge of #136147 - RalfJung:required-target-features-check-not-add, r=workingjubilee
ABI-required target features: warn when they are missing in base CPU

Part of https://github.com/rust-lang/rust/pull/135408:
instead of adding ABI-required features to the target we build for LLVM, check that they are already there. Crucially we check this after applying `-Ctarget-cpu` and `-Ctarget-feature`, by reading `sess.unstable_target_features`. This means we can tweak the ABI target feature check without changing the behavior for any existing user; they will get warnings but the target features behave as before.

The test changes here show that we are un-doing the "add all required target features" part. Without the full #135408, there is no way to take a way an ABI-required target feature with `-Ctarget-cpu`, so we cannot yet test that part.

Cc ``@workingjubilee``
2025-01-29 03:12:21 +01:00
Taiki Endo
93465e6c31 Mark condition/carry bit as clobbered in C-SKY inline assembly 2025-01-29 06:46:05 +09:00
Taiki Endo
e586382feb Support clobber_abi in BPF inline assembly 2025-01-29 02:14:25 +09:00
Alisa Sireneva
efaeedef59 Fix tests/codegen/wasm_exceptions 2025-01-28 19:10:26 +03:00
Alisa Sireneva
5df51930f9 Fix tests/codegen/float/f128 2025-01-28 19:10:24 +03:00
bors
66d6064f9e Auto merge of #134290 - tgross35:windows-i128-callconv, r=bjorn3,wesleywiser
Windows x86: Change i128 to return via the vector ABI

Clang and GCC both return `i128` in xmm0 on windows-msvc and windows-gnu. Currently, Rust returns the type on the stack. Add a calling convention adjustment so we also return scalar `i128`s using the vector ABI, which makes our `i128` compatible with C.

In the future, Clang may change to return `i128` on the stack for its `-msvc` targets (more at [1]). If this happens, the change here will need to be adjusted to only affect MinGW.

Link: https://github.com/rust-lang/rust/issues/134288 (does not fix) [1]

try-job: x86_64-msvc
try-job: x86_64-msvc-ext1
try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
2025-01-28 06:11:13 +00:00
Ralf Jung
93ee180cfa ABI-required target features: warn when they are missing in base CPU (rather than silently enabling them) 2025-01-28 04:40:42 +01:00
Caleb Zulawski
44b2e6c07d Stabilize target_feature_11 2025-01-27 23:44:47 +01:00
Nikita Popov
f895e31d59 Fix SIMD codegen tests on LLVM 20
The splat contents are printed differently on LLVM 20.
2025-01-27 15:11:59 +01:00
Trevor Gross
a44a20ee4a Windows x86: Change i128 to return via the vector ABI
Clang and GCC both return `i128` in xmm0 on windows-msvc and
windows-gnu. Currently, Rust returns the type on the stack. Add a
calling convention adjustment so we also return scalar `i128`s using the
vector ABI, which makes our `i128` compatible with C.

In the future, Clang may change to return `i128` on the stack for its
`-msvc` targets (more at [1]). If this happens, the change here will
need to be adjusted to only affect MinGW.

Link: https://github.com/rust-lang/rust/issues/134288
2025-01-27 12:12:59 +00:00
Trevor Gross
581e0ac90c Introduce a test for the i128 calling convention on Windows
Currently we both pass and return `i128` indirectly on Windows for MSVC
and MinGW, but this will be adjusted. Introduce a test verifying the
current state.
2025-01-27 12:12:59 +00:00
Jörn Horstmann
3779b8e32e Consistently use the most significant bit of vector masks
This improves the codegen for vector `select`, `gather`, `scatter` and
boolean reduction intrinsics and fixes rust-lang/portable-simd#316.

The current behavior of most mask operations during llvm codegen is to
truncate the mask vector to <N x i1>, telling llvm to use the least
significat bit. The exception is the `simd_bitmask` intrinsics, which
already used the most signifiant bit.

Since sse/avx instructions are defined to use the most significant bit,
truncating means that llvm has to insert a left shift to move the bit
into the most significant position, before the mask can actually be
used.

Similarly on aarch64, mask operations like blend work bit by bit,
repeating the least significant bit across the whole lane involves
shifting it into the sign position and then comparing against zero.

By shifting before truncating to <N x i1>, we tell llvm that we only
consider the most significant bit, removing the need for additional
shift instructions in the assembly.
2025-01-26 16:44:23 +01:00
Joshua Wong
97005678c3 reduce Box::default stack copies in debug mode
The `Box::new(T::default())` implementation of `Box::default` only
had two stack copies in debug mode, compared to the current version,
which has four. By avoiding creating any `MaybeUninit<T>`'s and just writing
`T` directly to the `Box` pointer, the stack usage in debug mode remains
the same as the old version.
2025-01-26 03:48:27 -05:00
Joshua Wong
80faf20351 add test for Box::default's stack usage in debug mode 2025-01-26 03:48:27 -05:00
Jacob Pratt
61e572b3f6
Rollup merge of #135785 - folkertdev:s390x-vector-passmode-direct, r=bjorn3
use `PassMode::Direct` for vector types on `s390x`

closes https://github.com/rust-lang/rust/issues/135744
tracking issue: https://github.com/rust-lang/rust/issues/130869

Previously, all vector types were type erased to `Ni8`, now we pass non-wrapped vector types directly. That skips emitting a bunch of casting logic in rustc, that LLVM then has to clean up. The initial LLVM IR is also a bit more readable.

This calling convention is tested extensively in `tests/assembly/s390x-vector-abi.rs`, showing that this change has no impact on the ABI in practice.

r? ````@taiki-e````
2025-01-25 23:26:59 -05:00
clubby789
cd848c9f3e Implement optimize(none) attribute 2025-01-23 17:19:53 +00:00
bors
b2728d5426 Auto merge of #135674 - scottmcm:assume-better, r=estebank
Update our range `assume`s to the format that LLVM prefers

I found out in https://github.com/llvm/llvm-project/issues/123278#issuecomment-2597440158 that the way I started emitting the `assume`s in #109993 was suboptimal, and as seen in that LLVM issue the way we're doing it -- with two `assume`s sometimes -- can at times lead to CVP/SCCP not realize what's happening because one of them turns into a `ne` instead of conveying a range.

So this updates how it's emitted from
```
assume( x >= LOW );
assume( x <= HIGH );
```
or
```
// (for ranges that wrap the range)
assume( (x <= LOW) | (x >= HIGH) );
```
to
```
assume( (x - LOW) <= (HIGH - LOW) );
```
so that we don't need multiple `icmp`s nor multiple `assume`s for a single value, and both wrappping and non-wrapping ranges emit the same shape.

(And we don't bother emitting the subtraction if `LOW` is zero, since that's trivial for us to check too.)
2025-01-22 04:18:30 +00:00
bors
ed43cbcb88 Auto merge of #134299 - RalfJung:remove-start, r=compiler-errors
remove support for the (unstable) #[start] attribute

As explained by `@Noratrieb:`
`#[start]` should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.

I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:
- `std`-using cross-platform programs should use `fn main()`. the compiler, together with `std`, will then ensure that code ends up at `main` (by having a platform-specific entrypoint that gets directed through `lang_start` in `std` to `main` - but that's just an implementation detail)
- `no_std` platform-specific programs should use `#![no_main]` and define their own platform-specific entrypoint symbol with `#[no_mangle]`, like `main`, `_start`, `WinMain` or `my_embedded_platform_wants_to_start_here`. most of them only support a single platform anyways, and need cfg for the different platform's ways of passing arguments or other things *anyways*

`#[start]` is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is  a total lie. Those arguments are just stubbed out to zero on ~~Windows~~ wasm, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by `std`, like Windows or Unix-likes. `my_embedded_platform_wants_to_start_here` can't use it, and neither could a libc-less Linux program.
So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving *any* stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.

Note that this feature has **not** been RFCed in the first place.

*This comment was posted [in May](https://github.com/rust-lang/rust/issues/29633#issuecomment-2088596042) and so far nobody spoke up in that issue with a usecase that would require keeping the attribute.*

Closes https://github.com/rust-lang/rust/issues/29633

try-job: x86_64-gnu-nopt
try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
try-job: test-various
2025-01-21 19:46:20 +00:00
Ralf Jung
56c90dc31e remove support for the #[start] attribute 2025-01-21 06:59:15 -07:00
Oli Scherer
8f5f5e56a8 Add more tests 2025-01-21 08:27:30 +00:00
Oli Scherer
dfa4c01b2e Treat undef bytes as equal to any other byte 2025-01-21 08:27:21 +00:00
Oli Scherer
964c58a7d9 Ensure we always get a constant, even without mir opts 2025-01-21 08:25:59 +00:00
Oli Scherer
8876cf7181 Also generate undef scalars and scalar pairs 2025-01-21 08:22:15 +00:00
Folkert de Vries
893d81f1e2
on s390x, use PassMode::Direct for vector types 2025-01-20 21:02:21 +01:00
Matthias Krüger
bbec1510bb
Rollup merge of #133695 - x17jiri:hint_likely, r=Amanieu
Reexport likely/unlikely in std::hint

Since `likely`/`unlikely` should be working now, we could reexport them in `std::hint`. I'm not sure if this is already approved or if it requires approval

Tracking issue: #26179
2025-01-20 20:58:34 +01:00
Scott McMurray
6fe82006a4 Update our range assumes to the format that LLVM prefers 2025-01-17 20:39:38 -08:00
bors
bcd0683e5d Auto merge of #135534 - folkertdev:fix-wasm-i128-f128, r=tgross35
use indirect return for `i128` and `f128` on wasm32

fixes #135532

Based on https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md we now use an indirect return for  `i128`, `u128` and `f128`. That is what LLVM ended up doing anyway.

r? `@bjorn3`
2025-01-17 15:07:28 +00:00
bors
0c2c096e1a Auto merge of #135047 - Flakebi:amdgpu-kernel-cc, r=workingjubilee
Add gpu-kernel calling convention

The amdgpu-kernel calling convention was reverted in commit f6b21e90d1 (#120495 and https://github.com/rust-lang/rust-analyzer/pull/16463) due to inactivity in the amdgpu target.

Introduce a `gpu-kernel` calling convention that translates to `ptx_kernel` or `amdgpu_kernel`, depending on the target that rust compiles for.

Tracking issue: #135467
amdgpu target tracking issue: #135024
2025-01-17 04:36:09 +00:00
Folkert de Vries
702134a930
use indirect return for i128 and f128 on wasm32 2025-01-16 13:25:40 +01:00
Flakebi
e7e5202978
Add gpu-kernel calling convention
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1 due to inactivity in the amdgpu
target.

Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
2025-01-16 00:26:55 +01:00
Jiri Bobek
c656f879c9 Export likely(), unlikely() and cold_path() in std::hint 2025-01-15 21:42:47 +01:00
bors
7a202a9056 Auto merge of #135204 - RalfJung:win64-zst, r=SparrowLii
fix handling of ZST in win64 ABI on windows-msvc targets

The Microsoft calling conventions do not really say anything about ZST since they do not seem to exist in MSVC. However, both GCC and clang allow passing ZST over  `__attribute__((ms_abi))` functions (which matches our `extern "win64" fn`) on `windows-gnu` targets, and therefore implicitly define a de-facto ABI for these types (and lucky enough they seem to define the same ABI). This ABI should be the same for windows-msvc and windows-gnu targets, so we use this as a hint for how to implement this ABI everywhere: we always pass ZST by-ref.

The best alternative would be to just reject compiling functions which cannot exist in MSVC, but that would be a breaking change.

Cc `@programmerjake` `@ChrisDenton`
Fixes https://github.com/rust-lang/rust/issues/132893
2025-01-13 13:05:53 +00:00
Ralf Jung
675a1036ca on Windows, consistently pass ZST by-ref 2025-01-12 13:32:36 +01:00
Matthias Krüger
b8e230a824
Rollup merge of #134030 - folkertdev:min-fn-align, r=workingjubilee
add `-Zmin-function-alignment`

tracking issue: https://github.com/rust-lang/rust/issues/82232

This PR adds the `-Zmin-function-alignment=<align>` flag, that specifies a minimum alignment for all* functions.

### Motivation

This feature is requested by RfL [here](https://github.com/rust-lang/rust/issues/128830):

> i.e. the equivalents of `-fmin-function-alignment` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fmin-function-alignment_003dn), Clang does not support it) / `-falign-functions` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-falign-functions), [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang1-falign-functions)).
>
> For the Linux kernel, the behavior wanted is that of GCC's `-fmin-function-alignment` and Clang's `-falign-functions`, i.e. align all functions, including cold functions.
>
> There is [`feature(fn_align)`](https://github.com/rust-lang/rust/issues/82232), but we need to do it globally.

### Behavior

The `fn_align` feature does not have an RFC. It was decided at the time that it would not be necessary, but maybe we feel differently about that now? In any case, here are the semantics of this flag:

- `-Zmin-function-alignment=<align>` specifies the minimum alignment of all* functions
- the `#[repr(align(<align>))]` attribute can be used to override the function alignment on a per-function basis: when `-Zmin-function-alignment` is specified, the attribute's value is only used when it is higher than the value passed to `-Zmin-function-alignment`.
- the target may decide to use a higher value (e.g. on x86_64 the minimum that LLVM generates is 16)
- The highest supported alignment in rust is `2^29`: I checked a bunch of targets, and they all emit the `.p2align        29` directive for targets that align functions at all (some GPU stuff does not have function alignment).

*: Only with `build-std` would the minimum alignment also be applied to `std` functions.

---

cc `@ojeda`

r? `@workingjubilee` you were active on the tracking issue
2025-01-11 18:13:45 +01:00
Jacob Pratt
351e6188a8
Rollup merge of #135236 - scottmcm:more-mcp807-library-updates, r=ChrisDenton
Update a bunch of library types for MCP807

This greatly reduces the number of places that actually use the `rustc_layout_scalar_valid_range_*` attributes down to just 3:
```
library/core\src\ptr\non_null.rs
68:#[rustc_layout_scalar_valid_range_start(1)]

library/core\src\num\niche_types.rs
19:        #[rustc_layout_scalar_valid_range_start($low)]
20:        #[rustc_layout_scalar_valid_range_end($high)]
```

Everything else -- PAL Nanoseconds, alloc's `Cap`, niched FDs, etc -- all just wrap those `niche_types` types.

r? ghost
2025-01-11 01:55:05 -05:00
Folkert de Vries
47573bf61e
add -Zmin-function-alignment 2025-01-10 22:53:54 +01:00
Oli Scherer
65b01cb182 Use llvm.memset.p0i8.* to initialize all same-bytes arrays 2025-01-10 15:22:06 +00:00
Ralf Jung
d760bb6603 fix ZST handling for Windows ABIs on MSVC target 2025-01-10 12:16:49 +01:00
Oli Scherer
aec51564a5 Add regression test for option initialization 2025-01-10 08:27:41 +00:00
Scott McMurray
6f2a78345e Update a bunch of library types for MCP807
This greatly reduces the number of places that actually use the `rustc_layout_scalar_valid_range_*` attributes down to just 3:
```
library/core\src\ptr\non_null.rs
68:#[rustc_layout_scalar_valid_range_start(1)]

library/core\src\num\niche_types.rs
19:        #[rustc_layout_scalar_valid_range_start($low)]
20:        #[rustc_layout_scalar_valid_range_end($high)]
```

Everything else -- PAL Nanoseconds, alloc's `Cap`, niched FDs, etc -- all just wrap those `niche_types` types.
2025-01-09 23:47:11 -08:00
Jacob Pratt
4e4a93c2dd
Rollup merge of #131830 - hoodmane:emscripten-wasm-eh, r=workingjubilee
Add support for wasm exception handling to Emscripten target

This is a draft because we need some additional setting for the Emscripten target to select between the old exception handling and the new exception handling. I don't know how to add a setting like that, would appreciate advice from Rust folks. We could maybe choose to use the new exception handling if `Ctarget-feature=+exception-handling` is passed? I tried this but I get errors from llvm so I'm not doing it right.
2025-01-06 22:04:13 -05:00
bors
243d2ca4db Auto merge of #135112 - tgross35:combine-select-unpredictable-test, r=the8472
Merge the intrinsic and user tests for `select_unpredictable`

[1] mentions that having a single test with `-Zmerge-functions=disabled` is preferable to having two separate tests.  Apply that to the new `select_unpredictable` test here.

[1]: https://github.com/rust-lang/rust/pull/133964#issuecomment-2569693325
2025-01-06 10:52:07 +00:00
Hood Chatham
49c74234a7 Add support for wasm exception handling to Emscripten target
Gated behind an unstable `-Z emscripten-wasm-eh` flag
2025-01-06 10:29:54 +01:00
bors
feb32c6546 Auto merge of #134794 - RalfJung:abi-required-target-features, r=workingjubilee
Add a notion of "some ABIs require certain target features"

I think I finally found the right shape for the data and checks that I recently added in https://github.com/rust-lang/rust/pull/133099, https://github.com/rust-lang/rust/pull/133417, https://github.com/rust-lang/rust/pull/134337: we have a notion of "this ABI requires the following list of target features, and it is incompatible with the following list of target features". Both `-Ctarget-feature` and `#[target_feature]` are updated to ensure we follow the rules of the ABI.  This removes all the "toggleability" stuff introduced before, though we do keep the notion of a fully "forbidden" target feature -- this is needed to deal with target features that are actual ABI switches, and hence are needed to even compute the list of required target features.

We always explicitly (un)set all required and in-conflict features, just to avoid potential trouble caused by the default features of whatever the base CPU is. We do this *before* applying `-Ctarget-feature` to maintain backward compatibility; this poses a slight risk of missing some implicit feature dependencies in LLVM but has the advantage of not breaking users that deliberately toggle ABI-relevant target features. They get a warning but the feature does get toggled the way they requested.

For now, our logic supports x86, ARM, and RISC-V (just like the previous logic did). Unsurprisingly, RISC-V is the nicest. ;)

As a side-effect this also (unstably) allows *enabling* `x87` when that is harmless. I used the opportunity to mark SSE2 as required on x86-64, to better match the actual logic in LLVM and because all x86-64 chips do have SSE2. This infrastructure also prepares us for requiring SSE on x86-32 when we want to use that for our ABI (and for float semantics sanity), see https://github.com/rust-lang/rust/issues/133611, but no such change is happening in this PR.

r? `@workingjubilee`
2025-01-05 23:21:06 +00:00
Trevor Gross
74d2d4bfa4 Expand the select_unpredictable test for ZSTs
For ZSTs there is no selection that needs to take place, so assert that
no `select` statement is emitted.
2025-01-05 08:51:15 +00:00
Trevor Gross
d42c3ae02f Merge the intrinsic and user tests for select_unpredictable
[1] mentions that having a single test with `-Zmerge-functions=disabled`
is preferable to having two separate tests.  Apply that to the new
`select_unpredicatble` test here.

[1]: https://github.com/rust-lang/rust/pull/133964#issuecomment-2569693325
2025-01-05 01:17:07 +00:00
Matthias Krüger
75e412b8d1
Rollup merge of #135084 - maurer:nuw, r=nikic
Update carrying_mul_add test to tolerate `nuw`

LLVM 20 adds nuw to GEP operations in this code, tolerate them.

`@rustbot` label: +llvm-main

r? `@durin42`
2025-01-04 09:54:40 +01:00
Matthias Krüger
695da5b782
Rollup merge of #133964 - joboet:select_unpredictable, r=tgross35
core: implement `bool::select_unpredictable`

Tracking issue: #133962
ACP: https://github.com/rust-lang/libs-team/issues/468
2025-01-04 09:54:36 +01:00
Matthew Maurer
ed005245c6 Update carrying_mul_add test to tolerate nuw
LLVM 20 adds nuw to GEP operations in this code, tolerate them.
2025-01-03 20:25:14 +00:00
joboet
8f3aa358bf
add codegen test for bool::select_unpredictable 2025-01-03 19:44:08 +01:00
Ralf Jung
43ede97ebf arm: use target.llvm_floatabi over soft-float target feature 2024-12-31 12:41:20 +01:00
Ralf Jung
912b7291d0 add ABI target features *before* -Ctarget-features 2024-12-31 12:41:20 +01:00
Ralf Jung
eb527424a5 x86-64 hardfloat actually requires sse2 2024-12-31 12:41:20 +01:00
Ralf Jung
2bf27e09be explicitly model that certain ABIs require/forbid certain target features 2024-12-31 12:41:20 +01:00
bors
4e5fec2f1e Auto merge of #134757 - RalfJung:const_swap, r=scottmcm
stabilize const_swap

libs-api FCP passed in https://github.com/rust-lang/rust/issues/83163.

However, I only just realized that this actually involves an intrinsic. The intrinsic could be implemented entirely with existing stable const functionality, but we choose to make it a primitive to be able to detect more UB. So nominating for `@rust-lang/lang`  to make sure they are aware; I leave it up to them whether they want to FCP this.

While at it I also renamed the intrinsic to make the "nonoverlapping" constraint more clear.

Fixes #83163
2024-12-30 23:46:42 +00:00
Matthias Krüger
6c12546dc0
Rollup merge of #134871 - clubby789:test-63646, r=compiler-errors
Add codegen test for issue 63646

Closes #63646
2024-12-30 19:34:55 +01:00
Alex Gaynor
dab1c57723 Added codegen test for elidings bounds check when indexes are manually checked
Closes #55147
2024-12-29 08:02:40 -06:00
clubby789
71e3ea35b1 Add codegen test for issue 63646 2024-12-29 03:31:37 +00:00
Alex Gaynor
d6c73ebbf3 Added a codegen test for optimization with const arrays
Closes #107208
2024-12-28 13:28:35 -06:00
Scott McMurray
4669c0d756 Override carrying_mul_add in cg_llvm 2024-12-27 08:17:40 -08:00
Ralf Jung
7291b1eaf7 rename typed_swap → typed_swap_nonoverlapping 2024-12-25 10:53:03 +01:00
bors
303e8bd768 Auto merge of #131193 - EFanZh:asserts-vec-len, r=the8472
Asserts the maximum value that can be returned from `Vec::len`

Currently, casting `Vec<i32>` to `Vec<u32>` takes O(1) time:

```rust
// See <https://godbolt.org/z/hxq3hnYKG> for assembly output.
pub fn cast(vec: Vec<i32>) -> Vec<u32> {
    vec.into_iter().map(|e| e as _).collect()
}
```

But the generated assembly is not the same as the identity function, which prevents us from casting `Vec<Vec<i32>>` to `Vec<Vec<u32>>` within O(1) time:

```rust
// See <https://godbolt.org/z/7n48bxd9f> for assembly output.
pub fn cast(vec: Vec<Vec<i32>>) -> Vec<Vec<u32>> {
    vec.into_iter()
        .map(|e| e.into_iter().map(|e| e as _).collect())
        .collect()
}
```

This change tries to fix the problem. You can see the comparison here: <https://godbolt.org/z/jdManrKvx>.
2024-12-22 16:09:16 +00:00
bors
c1132470a6 Auto merge of #130733 - okaneco:is_ascii, r=scottmcm
Optimize `is_ascii` for `str` and `[u8]` further

Replace the existing optimized function with one that enables auto-vectorization.

This is especially beneficial on x86-64 as `pmovmskb` can be emitted with careful structuring of the code. The instruction can detect non-ASCII characters one vector register width at a time instead of the current `usize` at a time check.

The resulting implementation is completely safe.

`case00_libcore` is the current implementation, `case04_while_loop` is this PR.
```
benchmarks:
    ascii::is_ascii_slice::long::case00_libcore                             22.25/iter  +/- 1.09
    ascii::is_ascii_slice::long::case04_while_loop                           6.78/iter  +/- 0.92
    ascii::is_ascii_slice::medium::case00_libcore                            2.81/iter  +/- 0.39
    ascii::is_ascii_slice::medium::case04_while_loop                         1.56/iter  +/- 0.78
    ascii::is_ascii_slice::short::case00_libcore                             5.55/iter  +/- 0.85
    ascii::is_ascii_slice::short::case04_while_loop                          3.75/iter  +/- 0.22
    ascii::is_ascii_slice::unaligned_both_long::case00_libcore              26.59/iter  +/- 0.66
    ascii::is_ascii_slice::unaligned_both_long::case04_while_loop            5.78/iter  +/- 0.16
    ascii::is_ascii_slice::unaligned_both_medium::case00_libcore             2.97/iter  +/- 0.32
    ascii::is_ascii_slice::unaligned_both_medium::case04_while_loop          2.41/iter  +/- 0.10
    ascii::is_ascii_slice::unaligned_head_long::case00_libcore              23.71/iter  +/- 0.79
    ascii::is_ascii_slice::unaligned_head_long::case04_while_loop            7.83/iter  +/- 1.31
    ascii::is_ascii_slice::unaligned_head_medium::case00_libcore             3.69/iter  +/- 0.54
    ascii::is_ascii_slice::unaligned_head_medium::case04_while_loop          7.05/iter  +/- 0.32
    ascii::is_ascii_slice::unaligned_tail_long::case00_libcore              24.44/iter  +/- 1.41
    ascii::is_ascii_slice::unaligned_tail_long::case04_while_loop            5.12/iter  +/- 0.18
    ascii::is_ascii_slice::unaligned_tail_medium::case00_libcore             3.24/iter  +/- 0.40
    ascii::is_ascii_slice::unaligned_tail_medium::case04_while_loop          2.86/iter  +/- 0.14

```

`unaligned_head_medium` is the main regression in the benchmarks. It is a 32 byte string being sliced `bytes[1..]`.

The first commit can be used to run the benchmarks against the current core implementation.

Previous implementation was done in #74066

---

Two potential drawbacks of this implementation are that it increases instruction count and may regress other platforms/architectures. The benches here may also be too artificial to glean much insight from.
https://rust.godbolt.org/z/G9znGfY36
2024-12-22 02:44:13 +00:00
Taiki Endo
96edf41194 tests/codegen/asm: Remove uses of rustc_attrs and lang_items features by using minicore 2024-12-20 23:19:12 +09:00
许杰友 Jieyou Xu (Joe)
5415f067bd Explicitly register MSVC/NONMSVC revisions for some codegen tests 2024-12-19 20:36:51 +08:00
许杰友 Jieyou Xu (Joe)
aaca9fa482 compiletest: don't register MSVC/NONMSVC FileCheck prefixes
This was fragile as it was based on host target passed to compiletest,
but the user could cross-compile and run test for a different target
(e.g. cross from linux to msvc, but msvc won't be set on the target).
Furthermore, it was also very surprising as normally revision names
(other than `CHECK`) was accepted as FileCheck prefixes.
2024-12-19 20:36:51 +08:00
Josh Triplett
a105cd6066 Use field init shorthand where possible
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
2024-12-17 14:33:10 -08:00
DianQK
3fc506b4d4
Simplify the GEP instruction for index 2024-12-15 19:01:45 +08:00
EFanZh
7d450bbf31 Fix vec_pop_push_noop codegen test on wasm32-wasip1 target 2024-12-15 15:44:56 +08:00
EFanZh
b5ea631fbd Asserts the maximum value that can be returned from Vec::len 2024-12-15 15:44:56 +08:00
bors
dd436ae2a6 Auto merge of #133899 - scottmcm:strip-mir-debuginfo, r=oli-obk
We don't need `NonNull::as_ptr` debuginfo

In order to stop pessimizing the use of local variables in core, skip debug info for MIR temporaries in tiny (single-BB) functions.

For functions as simple as this -- `Pin::new`, etc -- nobody every actually wants debuginfo for them in the first place.  They're more like intrinsics than real functions, and stepping over them is good.
2024-12-13 08:32:20 +00:00
Michael Goulet
c605c84be8 Stabilize async closures 2024-12-13 00:04:56 +00:00
bors
1daec069fb Auto merge of #128004 - folkertdev:naked-fn-asm, r=Amanieu
codegen `#[naked]` functions using global asm

tracking issue: https://github.com/rust-lang/rust/issues/90957

Fixes #124375

This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc).

I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about.

Combined with https://github.com/rust-lang/rust/pull/127853, if both accepted, I think that resolves all steps from the tracking issue.

r? `@Amanieu`
2024-12-11 21:51:07 +00:00
Zalathar
9e6b7c17c8 coverage: Adjust a codegen test to ignore the order of covmap/covfun globals 2024-12-11 21:34:48 +11:00
Folkert de Vries
4202c1ea75
make naked function generics test stricter 2024-12-10 21:41:05 +01:00
Folkert de Vries
69a0c64e2b
fix the naked-asan test
we get these declarations

```
; opt level 0
declare x86_intrcc void @page_fault_handler(ptr byval([8 x i8]) align 8, i64) unnamed_addr #1
; opt level > 0
declare x86_intrcc void @page_fault_handler(ptr noalias nocapture noundef byval([8 x i8]) align 8 dereferenceable(8), i64 noundef) unnamed_addr #1
```

The space after `i64` in the original regex made the regex not match for
opt level 0. Removing the space fixes the issue.

```
declare x86_intrcc void @page_fault_handler(ptr {{.*}}, i64 {{.*}}){{.*}}#[[ATTRS:[0-9]+]]
```
2024-12-10 21:41:05 +01:00
Folkert
bd8f8e0631
codegen #[naked] functions using global_asm! 2024-12-10 21:41:03 +01:00
Scott McMurray
a7fc76a3ab We don't need NonNull::as_ptr debuginfo
Stop pessimizing the use of local variables in core by skipping debug info for MIR temporaries in tiny (single-BB) functions.

For functions as simple as this -- `Pin::new`, etc -- nobody every actually wants debuginfo for them in the first place.  They're more like intrinsics than real functions, and stepping over them is good.
2024-12-10 01:29:43 -08:00
Matthias Krüger
820ddaf67a
Rollup merge of #130777 - azhogin:azhogin/reg-struct-return, r=workingjubilee
rust_for_linux: -Zreg-struct-return commandline flag for X86 (#116973)

Command line flag `-Zreg-struct-return` for X86 (32-bit) for rust-for-linux.
This flag enables the same behavior as the `abi_return_struct_as_int` target spec key.

- Tracking issue: https://github.com/rust-lang/rust/issues/116973
2024-12-06 09:27:38 +01:00
Tim Neumann
8f0ea9a7be Adapt codegen tests for NUW inference 2024-12-05 16:08:41 +01:00
Oli Scherer
f613636ae8 Rename core_pattern_type and core_pattern_types lib feature gates to pattern_type_macro
That's what the gates are actually gating, and the single char difference in naming was not helpful either
2024-12-04 16:16:24 +00:00
Matthias Krüger
c179a15f7a
Rollup merge of #132612 - compiler-errors:async-trait-bounds, r=lcnr
Gate async fn trait bound modifier on `async_trait_bounds`

This PR moves `async Fn()` trait bounds into a new feature gate: `feature(async_trait_bounds)`. The general vibe is that we will most likely stabilize the `feature(async_closure)` *without* the `async Fn()` trait bound modifier, so we need to gate that separately.

We're trying to work on the general vision of `async` trait bound modifier general in: https://github.com/rust-lang/rfcs/pull/3710, however that RFC still needs more time for consensus to converge, and we've decided that the value that users get from calling the bound `async Fn()` is *not really* worth blocking landing async closures in general.
2024-12-03 17:27:05 +01:00
bors
8575f8f91b Auto merge of #104342 - mweber15:add_file_location_to_more_types, r=wesleywiser
Require `type_map::stub` callers to supply file information

This change attaches file information (`DIFile` reference and line number) to struct debug info nodes.

Before:

```
; foo.ll
...
!5 = !DIFile(filename: "<unknown>", directory: "")
...
!16 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyType", scope: !2, file: !5, size: 32, align: 32, elements: !17, templateParams: !19, identifier: "4cb373851db92e732c4cb5651b886dd0")
...
```

After:

```
; foo.ll
...
!3 = !DIFile(filename: "foo.rs", directory: "/home/matt/src/rust98678", checksumkind: CSK_SHA1, checksum: "bcb9f08512c8f3b8181ef4726012bc6807bc9be4")
...
!16 = !DICompositeType(tag: DW_TAG_structure_type, name: "MyType", scope: !2, file: !3, line: 3, size: 32, align: 32, elements: !17, templateParams: !19, identifier: "9e5968c7af39c148acb253912b7f409f")
...
```

Fixes #98678

r? `@wesleywiser`
2024-12-03 12:49:57 +00:00
Matt Weber
e9fbb6f271 Fix tests when using MinGW 2024-12-02 21:59:34 -05:00
Michael Goulet
59e3e8934e Gate async fn trait bound modifier on async_trait_bounds 2024-12-02 16:50:44 +00:00
Andrew Zhogin
9aab517d63 rust_for_linux: -Zreg-struct-return commandline flag for X86 (#116973) 2024-12-02 01:14:40 +07:00
许杰友 Jieyou Xu (Joe)
1aa01927d3
Rollup merge of #131551 - taiki-e:ppc-asm-vreg-inout, r=Amanieu
Support input/output in vector registers of PowerPC inline assembly

This extends currently clobber-only vector registers (`vreg`) support to allow passing `#[repr(simd)]` types as input/output.

| Architecture | Register class | Target feature | Allowed types |
| ------------ | -------------- | -------------- | -------------- |
| PowerPC      | `vreg` | `altivec` | `i8x16`, `i16x8`, `i32x4`, `f32x4` |
| PowerPC      | `vreg` | `vsx` | `f32`, `f64`, `i64x2`, `f64x2` |

In addition to floats and `core::simd` types listed above, `core::arch` types and custom `#[repr(simd)]` types of the same size and type are also allowed. All allowed types and relevant target features are currently unstable.

r? `@Amanieu`

`@rustbot` label +O-PowerPC +A-inline-assembly
2024-11-30 12:57:32 +08:00
Matthias Krüger
6c9e922685
Rollup merge of #131323 - jfrimmel:avr-inline-asm-clobber-abi, r=Amanieu
Support `clobber_abi` in AVR inline assembly

This PR implements the `clobber_abi` part necessary to eventually stabilize the inline assembly for AVR. This is tracked in #93335.
This is heavily inspired by the sibling-PR #131310 for the MSP430. I've explained my reasoning in the first commit message in detail, which is reproduced below for easier reviewing:

This follows the [ABI documentation] of AVR-GCC:

> The [...] call-clobbered general purpose registers (GPRs) are registers that might be destroyed (clobbered) by a function call.
>
> - **R18–R27, R30, R31**
>
>   These GPRs are call clobbered. An ordinary function may use them without restoring the contents. [...]
>
> - **R0, T-Flag**
>
>   The temporary register and the T-flag in SREG are also call-clobbered, but this knowledge is not exposed explicitly to the compiler (R0 is a fixed register).

Therefore this commit lists the aforementioned registers `r18–r27`, `r30` and `r31` as clobbered registers. Since the `r0` register (listed above as well) is not available in inline assembly at all (potentially because the AVR-GCC considers it a fixed register causing the register to never be used in register allocation and LLVM adopting this), there is no need to list it in the clobber list (the `r0`-variant is not even available). A comment was added to ensure, that the `r0` gets added to the clobber-list once the register gets usable in inline ASM.
Since the SREG is normally considered clobbered anyways (unless the user supplies the `preserve_flags`-option), there is no need to explicitly list a bit in this register (which is not possible to list anyways).

Note, that this commit completely ignores the case of interrupts (that are described in the ABI-specification), since every register touched in an ISR need to be saved anyways.

[ABI documentation]: https://gcc.gnu.org/wiki/avr-gcc#Call-Used_Registers

r? ``@Amanieu``

``@rustbot`` label +O-AVR
2024-11-29 16:02:20 +01:00
bors
d53f0b1d8e Auto merge of #123244 - Mark-Simulacrum:share-inline-never-generics, r=saethlin
Enable -Zshare-generics for inline(never) functions

This avoids inlining cross-crate generic items when possible that are
already marked inline(never), implying that the author is not intending
for the function to be inlined by callers. As such, having a local copy
may make it easier for LLVM to optimize but mostly just adds to binary
bloat and codegen time. In practice our benchmarks indicate this is
indeed a win for larger compilations, where the extra cost in dynamic
linking to these symbols is diminished compared to the advantages in
fewer copies that need optimizing in each binary.

It might also make sense it expand this with other heuristics (e.g.,
`#[cold]`) in the future, but this seems like a good starting point.

FWIW, I expect that doing cleanup in where we make the decision
what should/shouldn't be shared is also a good idea. Way too
much code needed to be tweaked to check this. But I'm hoping
to leave that for a follow-up PR rather than blocking this on it.
2024-11-28 21:44:34 +00:00
bors
a2545fd6fc Auto merge of #133540 - ehuss:compiletest-proc-macro, r=jieyouxu
Compiletest: add proc-macro header

This adds a `proc-macro` header to simplify using proc-macros, and to reduce boilerplate. This header works similar to the `aux-build` header where you pass a path for a proc-macro to be built.

This allows the `force-host`, `no-prefer-dynamic` headers, and `crate_type` attribute to be removed. Additionally it uses `--extern` like `aux_crate` (allows implicit `extern crate` in 2018) and `--extern proc_macro` (to place in the prelude in 2018).

~~This also includes a secondary change which defaults the edition of proc-macros to 2024. This further reduces boilerplate (removing `extern crate proc_macro;`), and allows using modern Rust syntax. I was a little on the fence including this. I personally prefer it, but I can imagine it might be confusing to others.~~ EDIT: Removed

Some tests were changed so that when there is a chain of dependencies A→B→C, that the `@ proc-macro` is placed in `B` instead of `A` so that the `--extern` flag works correctly (previously it depended on `-L` to find `C`). I think this is better to make the dependencies more explicit. None of these tests looked like the were actually testing this behavior.

There is one test that had an unexplained output change: `tests/ui/macros/same-sequence-span.rs`. I do not know why it changed, but it didn't look like it was particularly important. Perhaps there was a normalization issue?

This is currently not compatible with the rustdoc `build-aux-docs` header. It can probably be fixed, I'm just not feeling motivated to do that right now.

### Implementation steps

- [x] Document this new behavior in rustc-dev-guide once we figure out the specifics. https://github.com/rust-lang/rustc-dev-guide/pull/2149
2024-11-28 19:00:58 +00:00
Mark Rousskov
4a216a25d1 Share inline(never) generics across crates
This reduces code sizes and better respects programmer intent when
marking inline(never). Previously such a marking was essentially ignored
for generic functions, as we'd still inline them in remote crates.
2024-11-28 13:43:05 -05:00
Taiki Endo
0f8ebba54a Support #[repr(simd)] types in input/output of PowerPC inline assembly 2024-11-29 00:24:36 +09:00
Julian Frimmel
2bd3bbb2e0
Move & rename test case to match naming of #132456 2024-11-28 16:12:04 +01:00
Julian Frimmel
d7e0a3eee0
Add test case for the clobber options 2024-11-28 16:12:03 +01:00
Guillaume Gomez
23bab15d73
Rollup merge of #133463 - taiki-e:aarch64-asm-x18, r=Amanieu
Fix handling of x18 in AArch64 inline assembly on ohos/trusty or with -Zfixed-x18

Currently AArch64 inline assembly allows using x18 on ohos/trusty or with -Zfixed-x18.

7db7489f9b/compiler/rustc_target/src/asm/aarch64.rs (L74-L76)

However, x18 is reserved in these environments and should not be allowed in the input/output operands of inline assemblies as it is in Android, Windows, etc..

7db7489f9b/compiler/rustc_target/src/spec/targets/aarch64_unknown_linux_ohos.rs (L19)
7db7489f9b/compiler/rustc_target/src/spec/targets/aarch64_unknown_trusty.rs (L18)
7db7489f9b/compiler/rustc_codegen_llvm/src/llvm_util.rs (L764-L771)

(As for ohos, +reserve-x18 is [redundant](c417b7a695 (diff-0ddf23e0bf2b28b2d05f842f087d1e6f694e8e06d1765e8d0f10d47fddcdff9c)) since 7a966b9188 that starting using llvm's ohos targets. So removed it from target-spec.)

This fix may potentially break the code for tier 2 target (aarch64-unknown-linux-ohos). (As for others, aarch64-unknown-trusty is tier 3 and -Zfixed-x18 is unstable so breaking them should be fine.)
However, in any case, it seems suspicious that the code that is broken by this was sound.

r? `@Amanieu`

`@rustbot` label O-AArch64 +A-inline-assembly
2024-11-28 12:06:02 +01:00
Guillaume Gomez
470c4f94e8
Rollup merge of #133452 - taiki-e:hexagon-asm-pred, r=Amanieu
Support predicate registers (clobber-only) in Hexagon inline assembly

The result of the Hexagon instructions such as comparison, store conditional, etc. is stored in predicate registers (`p[0-3]`), but currently there is no way to mark it as clobbered in `asm!`.

This is also needed for `clobber_abi` (although implementing `clobber_abi` will require the addition of support for [several more register classes](https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/Hexagon/HexagonRegisterInfo.cpp#L71-L90). see also https://github.com/rust-lang/rust/issues/93335#issuecomment-2395210055).

Refs:
- [Section 6 "Conditional Execution" in Qualcomm Hexagon V73 Programmer’s Reference Manual](https://docs.qualcomm.com/bundle/publicresource/80-N2040-53_REV_AB_Qualcomm_Hexagon_V73_Programmers_Reference_Manual.pdf#page=90)
- [Register definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/Hexagon/HexagonRegisterInfo.td#L155)

cc `@androm3da` (target maintainer of hexagon-unknown-{[none-elf](https://doc.rust-lang.org/nightly/rustc/platform-support/hexagon-unknown-none-elf.html#target-maintainers),[linux-musl](https://doc.rust-lang.org/nightly/rustc/platform-support/hexagon-unknown-linux-musl.html#target-maintainers)})

r? `@Amanieu`

`@rustbot` label +A-inline-assembly
(Currently there is no O-hexagon label...)
2024-11-28 12:06:02 +01:00
Guillaume Gomez
89ae19ee0d
Rollup merge of #133422 - taiki-e:riscv-e-clobber-abi, r=Amanieu
Fix clobber_abi in RV32E and RV64E inline assembly

Currently clobber_abi in RV32E and RV64E inline assembly is implemented using InlineAsmClobberAbi::RiscV, but broken since x16-x31 cannot be used in RV32E and RV64E.

```
error: cannot use register `x16`: register can't be used with the `e` target feature
  --> <source>:42:14
   |
42 |     asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
   |              ^^^^^^^^^^^^^^^^

error: cannot use register `x17`: register can't be used with the `e` target feature
  --> <source>:42:14
   |
42 |     asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
   |              ^^^^^^^^^^^^^^^^

error: cannot use register `x28`: register can't be used with the `e` target feature
  --> <source>:42:14
   |
42 |     asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
   |              ^^^^^^^^^^^^^^^^

error: cannot use register `x29`: register can't be used with the `e` target feature
  --> <source>:42:14
   |
42 |     asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
   |              ^^^^^^^^^^^^^^^^

error: cannot use register `x30`: register can't be used with the `e` target feature
  --> <source>:42:14
   |
42 |     asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
   |              ^^^^^^^^^^^^^^^^

error: cannot use register `x31`: register can't be used with the `e` target feature
  --> <source>:42:14
   |
42 |     asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
   |              ^^^^^^^^^^^^^^^^
```

r? `@Amanieu`

`@rustbot` label O-riscv +A-inline-assembly
2024-11-28 12:06:01 +01:00
Eric Huss
f94142b366 Update tests to use new proc-macro header 2024-11-27 07:18:25 -08:00
Taiki Endo
687dc19cb6 Fix handling of x18 in AArch64 inline assembly on ohos/trusty or with -Zfixed-x18 2024-11-26 03:10:22 +09:00
Hans Wennborg
402bdd183b Update test expectations to accept LLVM 'initializes' attribute
The test was checking for two `ptr` arguments by matching commas (or
non-commas), however after
https://github.com/llvm/llvm-project/pull/117104 LLVM adds an
`initializes((0, 16))` attribute, which includes a comma.

So instead, we make the test check for two LLVM values, i.e. something
prefixed by %.

(See also https://crbug.com/380707238)
2024-11-25 15:30:35 +01:00
Taiki Endo
59f01cdbf4 Support predicate registers (clobber-only) in Hexagon inline assembly 2024-11-25 23:11:17 +09:00
Taiki Endo
736c397f41 Fix clobber_abi in RV32E and RV64E inline assembly 2024-11-25 00:36:22 +09:00
Gary Guo
0178ba2c25 Make asm_goto_with_outputs a separate feature gate 2024-11-24 15:24:01 +00:00
Gary Guo
73f8309300 Support use of asm goto with outputs and options(noreturn)
When labels are present, the `noreturn` option really means that asm block
won't fallthrough -- if labels are present, then outputs can still be
meaningfully used.
2024-11-24 14:18:10 +00:00
Gary Guo
b8df869ebb Fix asm goto with outputs
When outputs are used together with labels, they are considered
to be written for all destinations, not only when falling through.
2024-11-24 14:18:10 +00:00
许杰友 Jieyou Xu (Joe)
c6d36256a6
Rollup merge of #127483 - BertalanD:no_sanitize-global-var, r=rcvalle
Allow disabling ASan instrumentation for globals

AddressSanitizer adds instrumentation to global variables unless the [`no_sanitize_address`](https://llvm.org/docs/LangRef.html#global-attributes) attribute is set on them.

This commit extends the existing `#[no_sanitize(address)]` attribute to set this; previously it only had the desired effect on functions.

(cc https://github.com/rust-lang/rust/issues/39699)
2024-11-23 20:19:51 +08:00
Michael Goulet
7b40a9b7c6
Rollup merge of #133102 - RalfJung:aarch64-softfloat, r=davidtwco,wesleywiser
aarch64 softfloat target: always pass floats in int registers

This is a part of https://github.com/rust-lang/rust/issues/131058: on softfloat aarch64 targets, the float registers may be unavailable. And yet, LLVM will happily use them to pass float types if the corresponding target features are enabled. That's a problem as it means enabling/disabling `neon` instructions can change the ABI.

Other targets have a `soft-float` target feature that forces the use of the soft-float ABI no matter whether float registers are enabled or not; aarch64 has nothing like that.

So we follow the aarch64 [softfloat ABI](https://github.com/rust-lang/rust/issues/131058#issuecomment-2385027423) and treat floats like integers for `extern "C"` functions. For the "Rust" ABI, we do the same for scalars, and then just do something reasonable for ScalarPair that avoids the pointer indirection.

Cc ```@workingjubilee```
2024-11-22 21:07:39 -05:00
Ralf Jung
666bcbdb2e aarch64 softfloat target: always pass floats in int registers 2024-11-20 20:41:28 +01:00
Jiri Bobek
777003ae9f Likely unlikely fix 2024-11-17 21:49:10 +01:00
Trevor Gross
5d818914af Always inline functions signatures containing f16 or f128
There are a handful of tier 2 and tier 3 targets that cause a LLVM crash
or linker error when generating code that contains `f16` or `f128`. The
cranelift backend also does not support these types. To work around
this, every function in `std` or `core` that contains these types must
be marked `#[inline]` in order to avoid sending any code to the backend
unless specifically requested.

However, this is inconvenient and easy to forget. Introduce a check for
these types in the frontend that automatically inlines any function
signatures that take or return `f16` or `f128`.

Note that this is not a perfect fix because it does not account for the
types being passed by reference or as members of aggregate types, but
this is sufficient for what is currently needed in the standard library.

Fixes: https://github.com/rust-lang/rust/issues/133035
Closes: https://github.com/rust-lang/rust/pull/133037
2024-11-14 16:18:41 -06:00
许杰友 Jieyou Xu (Joe)
91fa16b211 tests: use max-llvm-major-version instead of ignore-llvm-version range like N - 99
For tests that use `ignore-llvm-version: N - M`, replace that with
`max-llvm-major-version: N-1`.
2024-11-14 17:44:54 +08:00
Kirill Podoprigora
98a71766b8 Add `exact-llvm-major-version` directive 2024-11-13 15:05:31 +02:00
Matthias Krüger
bd79fe7a94
Rollup merge of #132702 - 1c3t3a:issue-132615, r=rcvalle
CFI: Append debug location to CFI blocks

Currently we're not appending debug locations to the inserted CFI blocks. This shows up in #132615 and #100783. This change fixes that by passing down the debug location to the CFI type-test generation and appending it to the blocks.

Credits also belong to `@jakos-sec` who worked with me on this.
2024-11-12 23:26:41 +01:00
Bastian Kersting
c2102259a0 CFI: Append debug location to CFI blocks 2024-11-11 09:17:43 +00:00
Taiki Endo
965a2801a0 Stabilize Arm64EC inline assembly 2024-11-10 17:43:46 +09:00
Jubilee
2f98dcf9ba
Rollup merge of #131258 - taiki-e:s390x-stabilize-asm, r=Amanieu
Stabilize s390x inline assembly

This stabilizes inline assembly for s390x (SystemZ).

Corresponding reference PR: https://github.com/rust-lang/reference/pull/1643

---

From the requirements of stabilization mentioned in https://github.com/rust-lang/rust/issues/93335

> Each architecture needs to be reviewed before stabilization:

> - It must have clobber_abi.

Done in https://github.com/rust-lang/rust/pull/130630.

> - It must be possible to clobber every register that is normally clobbered by a function call.

Done in the PR that added support for clobber_abi.

> - Generally review that the exposed register classes make sense.

The followings can be used as input/output:

- `reg` (`r[0-10]`, `r[12-14]`): General-purpose register

- `reg_addr` (`r[1-10]`, `r[12-14]`): General-purpose register except `r0` which is evaluated as zero in an address context

  This class is needed because `r0`, which may be allocated when using the `reg` class, cannot be used as a register in certain contexts. This is identical to the `a` constraint in LLVM and GCC. See https://github.com/rust-lang/rust/pull/119431 for details.

- `freg` (`f[0-15]`): Floating-point register

The followings are clobber-only:

- `vreg` (`v[0-31]`): Vector register

  Technically `vreg` should be able to accept `#[repr(simd)]` types as input/output if the unstable `vector` target feature added is enabled, but `core::arch` has no s390x vector type and both `#[repr(simd)]` and `core::simd` are unstable. Everything related is unstable, so the fact that this is currently a clobber-only should not be considered a stabilization blocker. (https://github.com/rust-lang/rust/issues/130869 tracks unstable stuff here)

- `areg` (`a[2-15]`): Access register

All of the above register classes except `reg_addr` are needed for `clobber_abi`.

The followings cannot be used as operands for inline asm (see also [getReservedRegs](https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/SystemZ/SystemZRegisterInfo.cpp#L258-L282) and [SystemZELFRegisters](https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/SystemZ/SystemZRegisterInfo.h#L107-L128) in LLVM):

- `r11`: frame pointer
- `r15`: stack pointer
- `a0`, `a1`: Reserved for system use
- `c[0-15]` (control register)  Reserved by the kernel

Although not listed in the above requirements, `preserves_flags` is implemented in https://github.com/rust-lang/rust/pull/111331.

---

cc ``@uweigand``

r? ``@Amanieu``

``@rustbot`` label +O-SystemZ +A-inline-assembly
2024-11-09 20:28:43 -08:00
Kyle Huey
1dc106121b Add discriminators to DILocations when multiple functions are inlined into a single point.
LLVM does not expect to ever see multiple dbg_declares for the same variable at the same
location with different values. proc-macros make it possible for arbitrary code,
including multiple calls that get inlined, to happen at any given location in the source
code. Add discriminators when that happens so these locations are different to LLVM.

This may interfere with the AddDiscriminators pass in LLVM, which is added by the
unstable flag -Zdebug-info-for-profiling.

Fixes #131944
2024-11-09 08:01:31 -08:00
Matthias Krüger
2829c2725a
Rollup merge of #132777 - durin42:llvm-20-poison-prop, r=nikic
try_question_mark_nop: update test for LLVM 20

llvm/llvm-project@dd116369f6 changes the IR of this test in a way that I don't think is bad, but needs adjusting.

r? `@nikic`
`@rustbot` label: +llvm-main
2024-11-09 10:52:04 +01:00
Matthias Krüger
6e05afd744
Rollup merge of #132745 - RalfJung:pointee-info-inside-enum, r=DianQK
pointee_info_at: fix logic for recursing into enums

Fixes https://github.com/rust-lang/rust/issues/131834

The logic in `pointee_info_at` was likely written at a time when the null pointer optimization was the *only* enum layout optimization -- and as `Variant::Multiple` kept getting expanded, nobody noticed that the logic is now unsound.

The job of this function is to figure out whether there is a dereferenceable-or-null and aligned pointer at a given offset inside a type. So when we recurse into a multi-variant enum, we better make sure that all the other enum variants must be null! This is the part that was forgotten, and this PR adds it.

The reason this didn't explode in many ways so far is that our references only have 1 niche value (null), so it's not possible on stable to have a multi-variant enum with a dereferenceable pointer and other enum variants that are not null. But with `rustc_layout_scalar_valid_range` attributes one can force such a layout, and if `@the8472's` work on alignment niches ever lands, that will make this possible on stable.
2024-11-09 10:52:03 +01:00
Augie Fackler
8fcc020a0c try_question_mark_nop: update test for LLVM 20
llvm/llvm-project@dd116369f6 changes the
IR of this test in a way that I don't think is bad, but needs adjusting.

r? @nikic
@rustbot label: +llvm-main
2024-11-08 10:43:06 -05:00
Ralf Jung
35a913b968 pointee_info_at: fix logic for recursing into enums 2024-11-08 07:35:29 +01:00
Jubilee
e2d429e22f
Rollup merge of #132740 - zmodem:simd_syntax_update, r=durin42
Update test for LLVM 20's new vector splat syntax

that was introduced in https://github.com/llvm/llvm-project/pull/112548
2024-11-07 18:48:25 -08:00
Jubilee
93e9ec05a9
Rollup merge of #131913 - jieyouxu:only_debug_assertions, r=onur-ozkan
Add `{ignore,needs}-{rustc,std}-debug-assertions` directive support

Add `{ignore,needs}-{rustc,std}-debug-assertions` compiletest directives and retire the old `{ignore,only}-debug` directives. The old `{ignore,only}-debug` directives were ambiguous because you could have std built with debug assertions but rustc not built with debug assertions or vice versa. If we want to support the use case of controlling test run based on if rustc was built with debug assertions, then having `{ignore,only}-debug` will be very confusing.

cc ````@matthiaskrgr````

Closes #123987.

r? bootstrap (or compiler tbh)
2024-11-07 18:48:21 -08:00
Taiki Endo
ab62a352ba Stabilize s390x inline assembly 2024-11-08 10:46:00 +09:00
Hans Wennborg
0e58f1c21a Update test for LLVM 20's new vector splat syntax
that was introduced in https://github.com/llvm/llvm-project/pull/112548
2024-11-07 20:02:20 +01:00
Taiki Endo
241f82ad91 Basic inline assembly support for SPARC and SPARC64 2024-11-07 21:19:03 +09:00
Matt Weber
d151593781 Fix relative lines in coroutine test 2024-11-06 22:26:18 -05:00
Wesley Wiser
64dd582166 Use -DAG to handle use of file before definition
Also fixup the test assertion for msvc and unix
2024-11-06 22:26:18 -05:00
Matt Weber
613ddc199d Restructure compile-flags for tests
Optimization needs to be explicitly disabled now.
2024-11-06 22:26:18 -05:00
Matt Weber
7329111466 Update compile flags formatting 2024-11-06 22:26:18 -05:00
Matt Weber
8200068a1d Rename generator test file 2024-11-06 22:26:18 -05:00
Matt Weber
21c58b1b2c Rename option and add doc 2024-11-06 22:26:18 -05:00
Matt Weber
a4833a8089 Move additional source location info behind -Z option 2024-11-06 22:26:17 -05:00
Matt Weber
b6659b0621 Move tests into issues directory 2024-11-06 22:26:17 -05:00
Matt Weber
fc59f2c28c Refactor and expand enum test 2024-11-06 22:26:17 -05:00
Matt Weber
aa1a16a359 Add test for async function and async block 2024-11-06 22:26:17 -05:00
Matt Weber
f92fc886f4 Allow check lines to exceed normal length limit 2024-11-06 22:26:17 -05:00
Matt Weber
937a00b78c Separate NONMSVC and MSVC checks 2024-11-06 22:26:17 -05:00
Matt Weber
cc4f214a78 Split metadata testing into multiple files
This helps with the fact that the order in which debuginfo is emitted
differs between platforms, and is probably not guaranteed to be stable
in general.
2024-11-06 22:26:15 -05:00
Matt Weber
aa485fc2a1 Add file and line metadata for enum variant and fields 2024-11-06 22:25:04 -05:00
Matt Weber
af6b0deaf3 Add file and line metadata for struct/union members 2024-11-06 22:25:04 -05:00
Matt Weber
c07797a854 Add file and line metadata for coroutines 2024-11-06 22:24:58 -05:00
Matt Weber
f3da828185 Refactor type_map::stub parameters
Push span lookup into `type_map::stub` and pass the `DefId` instead of
doing the lookup outside and passing in the location metadata.
2024-11-06 22:12:03 -05:00
Matt Weber
94669d9d47 Add file and line metadata for closures 2024-11-06 22:12:03 -05:00
Matt Weber
5b23c38dd5 Add codegen test to validate IR for debuginfo 2024-11-06 22:12:02 -05:00
okaneco
1b5c02b757 Add is_ascii function optimized for x86-64 for [u8]
The new `is_ascii` function is optimized to use the
`pmovmskb` vector instruction which tests the high bit in a lane.
This corresponds to the same check of whether a byte is ASCII so
ASCII validity checking can be vectorized. This instruction
does not exist on other platforms so it is likely to regress performance
and is gated to all(target_arch = "x86_64", target_feature = "sse2").

Add codegen test
Remove crate::mem import for functions included in the prelude
2024-11-06 02:22:00 -05:00