Commit Graph

936 Commits

Author SHA1 Message Date
bors
b2728d5426 Auto merge of #135674 - scottmcm:assume-better, r=estebank
Update our range `assume`s to the format that LLVM prefers

I found out in https://github.com/llvm/llvm-project/issues/123278#issuecomment-2597440158 that the way I started emitting the `assume`s in #109993 was suboptimal, and as seen in that LLVM issue the way we're doing it -- with two `assume`s sometimes -- can at times lead to CVP/SCCP not realize what's happening because one of them turns into a `ne` instead of conveying a range.

So this updates how it's emitted from
```
assume( x >= LOW );
assume( x <= HIGH );
```
or
```
// (for ranges that wrap the range)
assume( (x <= LOW) | (x >= HIGH) );
```
to
```
assume( (x - LOW) <= (HIGH - LOW) );
```
so that we don't need multiple `icmp`s nor multiple `assume`s for a single value, and both wrappping and non-wrapping ranges emit the same shape.

(And we don't bother emitting the subtraction if `LOW` is zero, since that's trivial for us to check too.)
2025-01-22 04:18:30 +00:00
bors
ed43cbcb88 Auto merge of #134299 - RalfJung:remove-start, r=compiler-errors
remove support for the (unstable) #[start] attribute

As explained by `@Noratrieb:`
`#[start]` should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.

I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:
- `std`-using cross-platform programs should use `fn main()`. the compiler, together with `std`, will then ensure that code ends up at `main` (by having a platform-specific entrypoint that gets directed through `lang_start` in `std` to `main` - but that's just an implementation detail)
- `no_std` platform-specific programs should use `#![no_main]` and define their own platform-specific entrypoint symbol with `#[no_mangle]`, like `main`, `_start`, `WinMain` or `my_embedded_platform_wants_to_start_here`. most of them only support a single platform anyways, and need cfg for the different platform's ways of passing arguments or other things *anyways*

`#[start]` is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is  a total lie. Those arguments are just stubbed out to zero on ~~Windows~~ wasm, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by `std`, like Windows or Unix-likes. `my_embedded_platform_wants_to_start_here` can't use it, and neither could a libc-less Linux program.
So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving *any* stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.

Note that this feature has **not** been RFCed in the first place.

*This comment was posted [in May](https://github.com/rust-lang/rust/issues/29633#issuecomment-2088596042) and so far nobody spoke up in that issue with a usecase that would require keeping the attribute.*

Closes https://github.com/rust-lang/rust/issues/29633

try-job: x86_64-gnu-nopt
try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
try-job: test-various
2025-01-21 19:46:20 +00:00
Ralf Jung
56c90dc31e remove support for the #[start] attribute 2025-01-21 06:59:15 -07:00
Oli Scherer
8f5f5e56a8 Add more tests 2025-01-21 08:27:30 +00:00
Oli Scherer
dfa4c01b2e Treat undef bytes as equal to any other byte 2025-01-21 08:27:21 +00:00
Oli Scherer
964c58a7d9 Ensure we always get a constant, even without mir opts 2025-01-21 08:25:59 +00:00
Oli Scherer
8876cf7181 Also generate undef scalars and scalar pairs 2025-01-21 08:22:15 +00:00
Matthias Krüger
bbec1510bb
Rollup merge of #133695 - x17jiri:hint_likely, r=Amanieu
Reexport likely/unlikely in std::hint

Since `likely`/`unlikely` should be working now, we could reexport them in `std::hint`. I'm not sure if this is already approved or if it requires approval

Tracking issue: #26179
2025-01-20 20:58:34 +01:00
Scott McMurray
6fe82006a4 Update our range assumes to the format that LLVM prefers 2025-01-17 20:39:38 -08:00
bors
bcd0683e5d Auto merge of #135534 - folkertdev:fix-wasm-i128-f128, r=tgross35
use indirect return for `i128` and `f128` on wasm32

fixes #135532

Based on https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md we now use an indirect return for  `i128`, `u128` and `f128`. That is what LLVM ended up doing anyway.

r? `@bjorn3`
2025-01-17 15:07:28 +00:00
bors
0c2c096e1a Auto merge of #135047 - Flakebi:amdgpu-kernel-cc, r=workingjubilee
Add gpu-kernel calling convention

The amdgpu-kernel calling convention was reverted in commit f6b21e90d1 (#120495 and https://github.com/rust-lang/rust-analyzer/pull/16463) due to inactivity in the amdgpu target.

Introduce a `gpu-kernel` calling convention that translates to `ptx_kernel` or `amdgpu_kernel`, depending on the target that rust compiles for.

Tracking issue: #135467
amdgpu target tracking issue: #135024
2025-01-17 04:36:09 +00:00
Folkert de Vries
702134a930
use indirect return for i128 and f128 on wasm32 2025-01-16 13:25:40 +01:00
Flakebi
e7e5202978
Add gpu-kernel calling convention
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1 due to inactivity in the amdgpu
target.

Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
2025-01-16 00:26:55 +01:00
Jiri Bobek
c656f879c9 Export likely(), unlikely() and cold_path() in std::hint 2025-01-15 21:42:47 +01:00
bors
7a202a9056 Auto merge of #135204 - RalfJung:win64-zst, r=SparrowLii
fix handling of ZST in win64 ABI on windows-msvc targets

The Microsoft calling conventions do not really say anything about ZST since they do not seem to exist in MSVC. However, both GCC and clang allow passing ZST over  `__attribute__((ms_abi))` functions (which matches our `extern "win64" fn`) on `windows-gnu` targets, and therefore implicitly define a de-facto ABI for these types (and lucky enough they seem to define the same ABI). This ABI should be the same for windows-msvc and windows-gnu targets, so we use this as a hint for how to implement this ABI everywhere: we always pass ZST by-ref.

The best alternative would be to just reject compiling functions which cannot exist in MSVC, but that would be a breaking change.

Cc `@programmerjake` `@ChrisDenton`
Fixes https://github.com/rust-lang/rust/issues/132893
2025-01-13 13:05:53 +00:00
Ralf Jung
675a1036ca on Windows, consistently pass ZST by-ref 2025-01-12 13:32:36 +01:00
Matthias Krüger
b8e230a824
Rollup merge of #134030 - folkertdev:min-fn-align, r=workingjubilee
add `-Zmin-function-alignment`

tracking issue: https://github.com/rust-lang/rust/issues/82232

This PR adds the `-Zmin-function-alignment=<align>` flag, that specifies a minimum alignment for all* functions.

### Motivation

This feature is requested by RfL [here](https://github.com/rust-lang/rust/issues/128830):

> i.e. the equivalents of `-fmin-function-alignment` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fmin-function-alignment_003dn), Clang does not support it) / `-falign-functions` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-falign-functions), [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang1-falign-functions)).
>
> For the Linux kernel, the behavior wanted is that of GCC's `-fmin-function-alignment` and Clang's `-falign-functions`, i.e. align all functions, including cold functions.
>
> There is [`feature(fn_align)`](https://github.com/rust-lang/rust/issues/82232), but we need to do it globally.

### Behavior

The `fn_align` feature does not have an RFC. It was decided at the time that it would not be necessary, but maybe we feel differently about that now? In any case, here are the semantics of this flag:

- `-Zmin-function-alignment=<align>` specifies the minimum alignment of all* functions
- the `#[repr(align(<align>))]` attribute can be used to override the function alignment on a per-function basis: when `-Zmin-function-alignment` is specified, the attribute's value is only used when it is higher than the value passed to `-Zmin-function-alignment`.
- the target may decide to use a higher value (e.g. on x86_64 the minimum that LLVM generates is 16)
- The highest supported alignment in rust is `2^29`: I checked a bunch of targets, and they all emit the `.p2align        29` directive for targets that align functions at all (some GPU stuff does not have function alignment).

*: Only with `build-std` would the minimum alignment also be applied to `std` functions.

---

cc `@ojeda`

r? `@workingjubilee` you were active on the tracking issue
2025-01-11 18:13:45 +01:00
Jacob Pratt
351e6188a8
Rollup merge of #135236 - scottmcm:more-mcp807-library-updates, r=ChrisDenton
Update a bunch of library types for MCP807

This greatly reduces the number of places that actually use the `rustc_layout_scalar_valid_range_*` attributes down to just 3:
```
library/core\src\ptr\non_null.rs
68:#[rustc_layout_scalar_valid_range_start(1)]

library/core\src\num\niche_types.rs
19:        #[rustc_layout_scalar_valid_range_start($low)]
20:        #[rustc_layout_scalar_valid_range_end($high)]
```

Everything else -- PAL Nanoseconds, alloc's `Cap`, niched FDs, etc -- all just wrap those `niche_types` types.

r? ghost
2025-01-11 01:55:05 -05:00
Folkert de Vries
47573bf61e
add -Zmin-function-alignment 2025-01-10 22:53:54 +01:00
Oli Scherer
65b01cb182 Use llvm.memset.p0i8.* to initialize all same-bytes arrays 2025-01-10 15:22:06 +00:00
Ralf Jung
d760bb6603 fix ZST handling for Windows ABIs on MSVC target 2025-01-10 12:16:49 +01:00
Oli Scherer
aec51564a5 Add regression test for option initialization 2025-01-10 08:27:41 +00:00
Scott McMurray
6f2a78345e Update a bunch of library types for MCP807
This greatly reduces the number of places that actually use the `rustc_layout_scalar_valid_range_*` attributes down to just 3:
```
library/core\src\ptr\non_null.rs
68:#[rustc_layout_scalar_valid_range_start(1)]

library/core\src\num\niche_types.rs
19:        #[rustc_layout_scalar_valid_range_start($low)]
20:        #[rustc_layout_scalar_valid_range_end($high)]
```

Everything else -- PAL Nanoseconds, alloc's `Cap`, niched FDs, etc -- all just wrap those `niche_types` types.
2025-01-09 23:47:11 -08:00
Jacob Pratt
4e4a93c2dd
Rollup merge of #131830 - hoodmane:emscripten-wasm-eh, r=workingjubilee
Add support for wasm exception handling to Emscripten target

This is a draft because we need some additional setting for the Emscripten target to select between the old exception handling and the new exception handling. I don't know how to add a setting like that, would appreciate advice from Rust folks. We could maybe choose to use the new exception handling if `Ctarget-feature=+exception-handling` is passed? I tried this but I get errors from llvm so I'm not doing it right.
2025-01-06 22:04:13 -05:00
bors
243d2ca4db Auto merge of #135112 - tgross35:combine-select-unpredictable-test, r=the8472
Merge the intrinsic and user tests for `select_unpredictable`

[1] mentions that having a single test with `-Zmerge-functions=disabled` is preferable to having two separate tests.  Apply that to the new `select_unpredictable` test here.

[1]: https://github.com/rust-lang/rust/pull/133964#issuecomment-2569693325
2025-01-06 10:52:07 +00:00
Hood Chatham
49c74234a7 Add support for wasm exception handling to Emscripten target
Gated behind an unstable `-Z emscripten-wasm-eh` flag
2025-01-06 10:29:54 +01:00
bors
feb32c6546 Auto merge of #134794 - RalfJung:abi-required-target-features, r=workingjubilee
Add a notion of "some ABIs require certain target features"

I think I finally found the right shape for the data and checks that I recently added in https://github.com/rust-lang/rust/pull/133099, https://github.com/rust-lang/rust/pull/133417, https://github.com/rust-lang/rust/pull/134337: we have a notion of "this ABI requires the following list of target features, and it is incompatible with the following list of target features". Both `-Ctarget-feature` and `#[target_feature]` are updated to ensure we follow the rules of the ABI.  This removes all the "toggleability" stuff introduced before, though we do keep the notion of a fully "forbidden" target feature -- this is needed to deal with target features that are actual ABI switches, and hence are needed to even compute the list of required target features.

We always explicitly (un)set all required and in-conflict features, just to avoid potential trouble caused by the default features of whatever the base CPU is. We do this *before* applying `-Ctarget-feature` to maintain backward compatibility; this poses a slight risk of missing some implicit feature dependencies in LLVM but has the advantage of not breaking users that deliberately toggle ABI-relevant target features. They get a warning but the feature does get toggled the way they requested.

For now, our logic supports x86, ARM, and RISC-V (just like the previous logic did). Unsurprisingly, RISC-V is the nicest. ;)

As a side-effect this also (unstably) allows *enabling* `x87` when that is harmless. I used the opportunity to mark SSE2 as required on x86-64, to better match the actual logic in LLVM and because all x86-64 chips do have SSE2. This infrastructure also prepares us for requiring SSE on x86-32 when we want to use that for our ABI (and for float semantics sanity), see https://github.com/rust-lang/rust/issues/133611, but no such change is happening in this PR.

r? `@workingjubilee`
2025-01-05 23:21:06 +00:00
Trevor Gross
74d2d4bfa4 Expand the select_unpredictable test for ZSTs
For ZSTs there is no selection that needs to take place, so assert that
no `select` statement is emitted.
2025-01-05 08:51:15 +00:00
Trevor Gross
d42c3ae02f Merge the intrinsic and user tests for select_unpredictable
[1] mentions that having a single test with `-Zmerge-functions=disabled`
is preferable to having two separate tests.  Apply that to the new
`select_unpredicatble` test here.

[1]: https://github.com/rust-lang/rust/pull/133964#issuecomment-2569693325
2025-01-05 01:17:07 +00:00
Matthias Krüger
75e412b8d1
Rollup merge of #135084 - maurer:nuw, r=nikic
Update carrying_mul_add test to tolerate `nuw`

LLVM 20 adds nuw to GEP operations in this code, tolerate them.

`@rustbot` label: +llvm-main

r? `@durin42`
2025-01-04 09:54:40 +01:00
Matthias Krüger
695da5b782
Rollup merge of #133964 - joboet:select_unpredictable, r=tgross35
core: implement `bool::select_unpredictable`

Tracking issue: #133962
ACP: https://github.com/rust-lang/libs-team/issues/468
2025-01-04 09:54:36 +01:00
Matthew Maurer
ed005245c6 Update carrying_mul_add test to tolerate nuw
LLVM 20 adds nuw to GEP operations in this code, tolerate them.
2025-01-03 20:25:14 +00:00
joboet
8f3aa358bf
add codegen test for bool::select_unpredictable 2025-01-03 19:44:08 +01:00
Ralf Jung
43ede97ebf arm: use target.llvm_floatabi over soft-float target feature 2024-12-31 12:41:20 +01:00
Ralf Jung
912b7291d0 add ABI target features *before* -Ctarget-features 2024-12-31 12:41:20 +01:00
Ralf Jung
eb527424a5 x86-64 hardfloat actually requires sse2 2024-12-31 12:41:20 +01:00
Ralf Jung
2bf27e09be explicitly model that certain ABIs require/forbid certain target features 2024-12-31 12:41:20 +01:00
bors
4e5fec2f1e Auto merge of #134757 - RalfJung:const_swap, r=scottmcm
stabilize const_swap

libs-api FCP passed in https://github.com/rust-lang/rust/issues/83163.

However, I only just realized that this actually involves an intrinsic. The intrinsic could be implemented entirely with existing stable const functionality, but we choose to make it a primitive to be able to detect more UB. So nominating for `@rust-lang/lang`  to make sure they are aware; I leave it up to them whether they want to FCP this.

While at it I also renamed the intrinsic to make the "nonoverlapping" constraint more clear.

Fixes #83163
2024-12-30 23:46:42 +00:00
Matthias Krüger
6c12546dc0
Rollup merge of #134871 - clubby789:test-63646, r=compiler-errors
Add codegen test for issue 63646

Closes #63646
2024-12-30 19:34:55 +01:00
Alex Gaynor
dab1c57723 Added codegen test for elidings bounds check when indexes are manually checked
Closes #55147
2024-12-29 08:02:40 -06:00
clubby789
71e3ea35b1 Add codegen test for issue 63646 2024-12-29 03:31:37 +00:00
Alex Gaynor
d6c73ebbf3 Added a codegen test for optimization with const arrays
Closes #107208
2024-12-28 13:28:35 -06:00
Scott McMurray
4669c0d756 Override carrying_mul_add in cg_llvm 2024-12-27 08:17:40 -08:00
Ralf Jung
7291b1eaf7 rename typed_swap → typed_swap_nonoverlapping 2024-12-25 10:53:03 +01:00
bors
303e8bd768 Auto merge of #131193 - EFanZh:asserts-vec-len, r=the8472
Asserts the maximum value that can be returned from `Vec::len`

Currently, casting `Vec<i32>` to `Vec<u32>` takes O(1) time:

```rust
// See <https://godbolt.org/z/hxq3hnYKG> for assembly output.
pub fn cast(vec: Vec<i32>) -> Vec<u32> {
    vec.into_iter().map(|e| e as _).collect()
}
```

But the generated assembly is not the same as the identity function, which prevents us from casting `Vec<Vec<i32>>` to `Vec<Vec<u32>>` within O(1) time:

```rust
// See <https://godbolt.org/z/7n48bxd9f> for assembly output.
pub fn cast(vec: Vec<Vec<i32>>) -> Vec<Vec<u32>> {
    vec.into_iter()
        .map(|e| e.into_iter().map(|e| e as _).collect())
        .collect()
}
```

This change tries to fix the problem. You can see the comparison here: <https://godbolt.org/z/jdManrKvx>.
2024-12-22 16:09:16 +00:00
bors
c1132470a6 Auto merge of #130733 - okaneco:is_ascii, r=scottmcm
Optimize `is_ascii` for `str` and `[u8]` further

Replace the existing optimized function with one that enables auto-vectorization.

This is especially beneficial on x86-64 as `pmovmskb` can be emitted with careful structuring of the code. The instruction can detect non-ASCII characters one vector register width at a time instead of the current `usize` at a time check.

The resulting implementation is completely safe.

`case00_libcore` is the current implementation, `case04_while_loop` is this PR.
```
benchmarks:
    ascii::is_ascii_slice::long::case00_libcore                             22.25/iter  +/- 1.09
    ascii::is_ascii_slice::long::case04_while_loop                           6.78/iter  +/- 0.92
    ascii::is_ascii_slice::medium::case00_libcore                            2.81/iter  +/- 0.39
    ascii::is_ascii_slice::medium::case04_while_loop                         1.56/iter  +/- 0.78
    ascii::is_ascii_slice::short::case00_libcore                             5.55/iter  +/- 0.85
    ascii::is_ascii_slice::short::case04_while_loop                          3.75/iter  +/- 0.22
    ascii::is_ascii_slice::unaligned_both_long::case00_libcore              26.59/iter  +/- 0.66
    ascii::is_ascii_slice::unaligned_both_long::case04_while_loop            5.78/iter  +/- 0.16
    ascii::is_ascii_slice::unaligned_both_medium::case00_libcore             2.97/iter  +/- 0.32
    ascii::is_ascii_slice::unaligned_both_medium::case04_while_loop          2.41/iter  +/- 0.10
    ascii::is_ascii_slice::unaligned_head_long::case00_libcore              23.71/iter  +/- 0.79
    ascii::is_ascii_slice::unaligned_head_long::case04_while_loop            7.83/iter  +/- 1.31
    ascii::is_ascii_slice::unaligned_head_medium::case00_libcore             3.69/iter  +/- 0.54
    ascii::is_ascii_slice::unaligned_head_medium::case04_while_loop          7.05/iter  +/- 0.32
    ascii::is_ascii_slice::unaligned_tail_long::case00_libcore              24.44/iter  +/- 1.41
    ascii::is_ascii_slice::unaligned_tail_long::case04_while_loop            5.12/iter  +/- 0.18
    ascii::is_ascii_slice::unaligned_tail_medium::case00_libcore             3.24/iter  +/- 0.40
    ascii::is_ascii_slice::unaligned_tail_medium::case04_while_loop          2.86/iter  +/- 0.14

```

`unaligned_head_medium` is the main regression in the benchmarks. It is a 32 byte string being sliced `bytes[1..]`.

The first commit can be used to run the benchmarks against the current core implementation.

Previous implementation was done in #74066

---

Two potential drawbacks of this implementation are that it increases instruction count and may regress other platforms/architectures. The benches here may also be too artificial to glean much insight from.
https://rust.godbolt.org/z/G9znGfY36
2024-12-22 02:44:13 +00:00
Taiki Endo
96edf41194 tests/codegen/asm: Remove uses of rustc_attrs and lang_items features by using minicore 2024-12-20 23:19:12 +09:00
许杰友 Jieyou Xu (Joe)
5415f067bd Explicitly register MSVC/NONMSVC revisions for some codegen tests 2024-12-19 20:36:51 +08:00
许杰友 Jieyou Xu (Joe)
aaca9fa482 compiletest: don't register MSVC/NONMSVC FileCheck prefixes
This was fragile as it was based on host target passed to compiletest,
but the user could cross-compile and run test for a different target
(e.g. cross from linux to msvc, but msvc won't be set on the target).
Furthermore, it was also very surprising as normally revision names
(other than `CHECK`) was accepted as FileCheck prefixes.
2024-12-19 20:36:51 +08:00
Josh Triplett
a105cd6066 Use field init shorthand where possible
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
2024-12-17 14:33:10 -08:00