Commit Graph

798 Commits

Author SHA1 Message Date
Lzu Tao
e7d6da3676 Add codegen test for array comparison opt 2024-05-21 00:09:25 +00:00
Joshua Wong
a4efe6fe27 add codegen test for issue 120493 2024-05-20 09:21:09 -05:00
Matthias Krüger
e4e75688c2
Rollup merge of #125184 - scottmcm:fix-thin-ptr-ice, r=jieyouxu
Fix ICE in non-operand `aggregate_raw_ptr` intrinsic codegen

Introduced in #123840
Found in #121571, cc `@clarfonthey`
2024-05-18 18:44:14 +02:00
Scott McMurray
f60f2e8cb0 Fix ICE in non-operand aggregate_raw_ptr intrinsic codegen 2024-05-16 09:43:42 -07:00
Trevor Gross
488ddd3bbc Fix assertion when attempting to convert f16 and f128 with as
These types are currently rejected for `as` casts by the compiler.
Remove this incorrect check and add codegen tests for all conversions
involving these types.
2024-05-16 04:07:02 -05:00
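As an illustration of the conversions this commit unlocks, here is a minimal nightly-only sketch (function name and shape are assumptions, not the actual test file):

```rust
// Hypothetical sketch: `as` casts involving f16/f128, which previously
// tripped the removed assertion, now compile and lower to the usual
// fpext/fptrunc/sitofp operations.
#![feature(f16, f128)]

#[no_mangle]
pub fn casts(x: f16, n: i32) -> (f32, f128, f16) {
    (x as f32, x as f128, n as f16)
}
```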
Scott McMurray
dcab06d7d2 Unify Rvalue::Aggregate paths in cg_ssa 2024-05-11 21:22:51 -07:00
klensy
d97ed2d349 fix a few typos in filecheck annotations 2024-05-11 13:10:24 +03:00
Scott McMurray
c38f75c21f Make SSA aggregates without needing an alloca 2024-05-08 20:38:04 -07:00
Scott McMurray
443bdc0946 Add a codegen test for transparent aggregates 2024-05-08 20:36:11 -07:00
Arthur Eubanks
6c348aca4e Adjust dbg.value/dbg.declare checks for LLVM update
https://github.com/llvm/llvm-project/pull/89799 changes the llvm.dbg.value/declare intrinsics to a different, non-instruction debug-record representation. For example,
  call void @llvm.dbg.declare(...)
becomes
  #dbg_declare(...)

Update the tests accordingly so they work with both the old and the new form.
2024-05-06 23:15:48 +00:00
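A hedged sketch of a check line tolerating both encodings (the regex is an assumption, not quoted from the commit):

```rust
// Hypothetical codegen-test fragment, compiled with debuginfo enabled;
// the FileCheck alternation matches the old intrinsic call and the new
// debug-record form alike.
#[no_mangle]
pub fn with_local() -> i32 {
    // CHECK: {{(call void @llvm.dbg.declare|#dbg_declare)}}
    let local = 42;
    local
}
```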
Urgau
ed81578820 tests/ui: prepare some tests for --check-cfg by default 2024-05-04 11:30:38 +02:00
Alice Ryhl
40f0172c6a Add -Zfixed-x18
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
2024-05-03 14:32:08 +02:00
Josh Stone
1b79bb937f Add inline comments why we're forcing the target cpu 2024-05-01 16:54:20 -07:00
Josh Stone
706f06c39a Use an explicit x86-64 cpu in tests that are sensitive to it
There are a few tests that depend on some target features **not** being
enabled by default, and usually they are correct with the default x86-64
target CPU. However, in downstream builds we have modified the default
to fit our distros -- `x86-64-v2` in RHEL 9 and `x86-64-v3` in RHEL 10
-- and the latter especially trips tests that expect not to have AVX.

These cases are few enough that we can just set them back explicitly.
2024-05-01 15:25:26 -07:00
Matthias Krüger
d81e444c8e
Rollup merge of #124543 - maurer:llvm-range, r=nikic
codegen tests: Tolerate `range()` qualifications in enum tests

Current LLVM can infer range bounds on the i8s involved with these tests, and annotates them. Accept these bounds if present.

`@rustbot` label: +llvm-main

cc `@durin42`
2024-04-30 06:43:43 +02:00
Matthew Maurer
8101884b37 codegen tests: Tolerate range() qualifications in enum tests
Current LLVM can infer range bounds on the i8s involved with these
tests, and annotates them. Accept these bounds if present.
2024-04-30 00:02:49 +00:00
Krasimir Georgiev
52ea73a540 adapt a codegen test for llvm 19
No functional changes intended.

Found by our experimental rust + LLVM @ HEAD bot:
https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/27747#018f2570-018c-4b12-9c5a-38cf81453683/957-965
2024-04-29 13:03:45 +00:00
bors
284f94f9c0 Auto merge of #121298 - nikic:writable, r=cuviper
Set writable and dead_on_unwind attributes for sret arguments

Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's.

See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that:

 * The return slot is writable.
 * The return slot will not be read if the function unwinds.

Fixes https://github.com/rust-lang/rust/issues/90595.
2024-04-25 04:31:56 +00:00
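A codegen-test-style sketch of the kind of signature affected (names and CHECK pattern are assumptions; wildcards stand in for the exact attribute order):

```rust
// A function returning a large aggregate is lowered with an `sret` pointer
// parameter; after this change that parameter should also carry the
// `writable` and `dead_on_unwind` attributes described above.
pub struct Big([u64; 8]);

// CHECK: @returns_big(ptr {{.*}}dead_on_unwind{{.*}}writable{{.*}}sret
#[no_mangle]
pub fn returns_big() -> Big {
    Big([0; 8])
}
```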
Nikita Popov
976267b514 Add needs-unwind to codegen test
When compiled with -C panic=abort we'd generate an extra
panic_cannot_unwind shim in the variant calling C-unwind.
2024-04-25 11:44:32 +09:00
Nikita Popov
137775dd63 Fix incorrect CHECK-LABEL 2024-04-25 11:43:47 +09:00
Nikita Popov
3695af697e Set writable and dead_on_unwind attributes for sret arguments 2024-04-25 11:43:47 +09:00
Gary Guo
cfee72aa24 Fix tests and bless 2024-04-24 13:12:33 +01:00
Oli Scherer
aef0f4024a Error on using yield without also using #[coroutine] on the closure
And suggest adding the `#[coroutine]` to the closure
2024-04-24 08:05:29 +00:00
bors
29a56a3b1c Auto merge of #122053 - erikdesjardins:alloca, r=nikic
Stop using LLVM struct types for alloca

The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.

Split out from #121577.

r? `@ghost`
2024-04-24 03:00:44 +00:00
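A sketch of the new alloca shape for a 16-byte local (names and CHECK line assumed):

```rust
// The tuple is forced into memory by the reference; its stack slot is now
// requested as 16 plain bytes rather than as an LLVM struct type.
// CHECK: alloca [16 x i8], align 8
#[no_mangle]
pub fn stack_pair(consume: fn(&(u64, u64))) {
    let pair = (1u64, 2u64);
    consume(&pair);
}
```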
Matthias Krüger
918304b190
Rollup merge of #124003 - WaffleLapkin:dellvmization, r=scottmcm,RalfJung,antoyo
Dellvmize some intrinsics (use `u32` instead of `Self` in some integer intrinsics)

This implements https://github.com/rust-lang/compiler-team/issues/693 minus what was implemented in #123226.

Note: I decided to _not_ change `shl`/... builder methods, as it just doesn't seem worth it.

r? ``@scottmcm``
2024-04-23 20:17:51 +02:00
Trevor Spiteri
245cc23a2f add codegen test
The test confirms that when val < base, we do not divide or multiply.
2024-04-23 18:31:57 +02:00
Markus Reiter
33e68aadc9
Stabilize generic NonZero. 2024-04-22 18:48:47 +02:00
Mark Rousskov
f1ae5314be Avoid reloading Vec::len across grow_one in push
This saves an extra load from memory.
2024-04-20 21:07:00 -04:00
Scott McMurray
986d9f104b Make checked ops emit *unchecked* LLVM operations where feasible
For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write the checked methods in such a way that we stop emitting wrapping versions of them.

For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`.  After this PR they do.
2024-04-18 18:11:21 -07:00
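A minimal sketch of the improvement in codegen-test form (CHECK line assumed):

```rust
// `checked_sub` verifies `a >= b` up front, so the subtraction itself can
// carry the no-unsigned-wrap flag after this PR.
// CHECK: sub nuw i32
#[no_mangle]
pub fn diff(a: u32, b: u32) -> u32 {
    a.checked_sub(b).unwrap()
}
```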
Scott McMurray
d05545c05d At debuginfo=0, don't inline debuginfo when inlining 2024-04-18 09:35:35 -07:00
Maybe Waffle
c2046c4b09 Add codegen tests for changed intrinsics 2024-04-16 12:35:22 +00:00
bors
5dcb678ad8 Auto merge of #122917 - saethlin:atomicptr-to-int, r=nikic
Add the missing inttoptr when we ptrtoint in ptr atomics

Ralf noticed this here: https://github.com/rust-lang/rust/pull/122220#discussion_r1535172094

Our previous codegen forgot to add the cast back to integer type. The code compiles anyway, because of course all locals are in-memory to start with, so previous codegen would do the integer atomic, store the integer to a local, then load a pointer from that local. Which is definitely _not_ what we wanted: That's an integer-to-pointer transmute, so all pointers returned by these `AtomicPtr` methods didn't have provenance. Yikes.

Here's the IR for `AtomicPtr::fetch_byte_add` on 1.76: https://godbolt.org/z/8qTEjeraY
```llvm
define noundef ptr @atomicptr_fetch_byte_add(ptr noundef nonnull align 8 %a, i64 noundef %v) unnamed_addr #0 !dbg !7 {
start:
  %0 = alloca ptr, align 8, !dbg !12
  %val = inttoptr i64 %v to ptr, !dbg !12
  call void @llvm.lifetime.start.p0(i64 8, ptr %0), !dbg !28
  %1 = ptrtoint ptr %val to i64, !dbg !28
  %2 = atomicrmw add ptr %a, i64 %1 monotonic, align 8, !dbg !28
  store i64 %2, ptr %0, align 8, !dbg !28
  %self = load ptr, ptr %0, align 8, !dbg !28
  call void @llvm.lifetime.end.p0(i64 8, ptr %0), !dbg !28
  ret ptr %self, !dbg !33
}
```

r? `@RalfJung`
cc `@nikic`
2024-04-15 08:07:47 +00:00
Matthias Krüger
4a0e9e0deb
Rollup merge of #123249 - goolmoos:naked_variadics, r=pnkfelix
do not add prolog for variadic naked functions

fixes #99858
2024-04-12 17:41:33 +02:00
Erik Desjardins
daaaacdcb3 remove alloca type from issue-105386-ub-in-debuginfo
It's irrelevant for the purposes of this test (there is only one alloca)
and its size changes depending on the target, so it can't be matched
easily.
2024-04-12 08:36:22 -04:00
Guy Shefy
9139d7252d do not add prolog for variadic naked functions
fixes #99858
2024-04-12 15:29:39 +03:00
Erik Desjardins
f4426c189f use [N x i8] for alloca types 2024-04-11 21:42:35 -04:00
Matthew Maurer
e70cf014b8 codegen tests: Tolerate nuw nsw on trunc
llvm/llvm-project#87910 infers `nuw` and `nsw` on some `trunc`
instructions we're doing `FileCheck` on. Tolerate but don't require them
to support both release and head LLVM.
2024-04-11 17:20:08 +00:00
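A sketch of the tolerate-but-don't-require pattern (regex form assumed):

```rust
// Matches `trunc`, `trunc nuw`, `trunc nsw`, and `trunc nuw nsw` alike.
#[no_mangle]
pub fn narrow(x: u64) -> u32 {
    // CHECK: trunc{{( nuw)?( nsw)?}} i64 %{{.*}} to i32
    x as u32
}
```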
León Orell Valerian Liehr
aac3f24054
Rollup merge of #122470 - tgross35:f16-f128-step4-libs-min, r=Amanieu
`f16` and `f128` step 4: basic library support

This is the next step after https://github.com/rust-lang/rust/pull/121926, another portion of https://github.com/rust-lang/rust/pull/114607

Tracking issue: https://github.com/rust-lang/rust/issues/116909

This PR adds the most basic operations to `f16` and `f128` that get lowered as LLVM intrinsics. This is a very small step but it seemed reasonable enough to add unopinionated basic operations before the larger modules that are built on top of them.

r? ```@Amanieu``` since you were pretty involved in the RFC
cc ```@compiler-errors```
```@rustbot``` label +T-libs-api +S-blocked +F-f16_and_f128
2024-04-11 01:56:23 +02:00
Trevor Gross
454de78ea3 Add basic library support for f16 and f128
Implement basic operation traits that get lowered to intrinsics. This
includes codegen tests for implemented operations.
2024-04-10 13:50:27 -04:00
bors
c2239bca5b Auto merge of #123185 - scottmcm:more-typed-copy, r=compiler-errors
Remove my `scalar_copy_backend_type` optimization attempt

I added this back in https://github.com/rust-lang/rust/pull/111999 , but I no longer think it's a good idea
- It had to get scaled back to only power-of-two things to not break a bunch of targets
- LLVM seems to be getting better at memcpy removal anyway
- Introducing vector instructions has seemed to sometimes (https://github.com/rust-lang/rust/pull/115515#issuecomment-1750069529) make autovectorization worse

So this removes it from the codegen crates entirely, and instead just tries to use <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html#method.typed_place_copy> instead of direct `memcpy` so things will still use load/store when a type isn't `OperandValue::Ref`.
2024-04-10 16:32:41 +00:00
Scott McMurray
593e900ad2 Update 122805 test for PR 123185 2024-04-10 08:28:43 -07:00
Matthias Krüger
2ddf984594
Rollup merge of #123612 - kxxt:riscv-target-abi, r=jieyouxu,nikic,DianQK
Set target-abi module flag for RISC-V targets

Fixes cross-language LTO on RISC-V targets (Fixes #121924)
2024-04-10 04:27:40 +02:00
Scott McMurray
b5376ba601 Remove my scalar_copy_backend_type optimization attempt
I added this back in 111999, but I no longer think it's a good idea
- It had to get scaled back to only power-of-two things to not break a bunch of targets
- LLVM seems to be getting better at memcpy removal anyway
- Introducing vector instructions has seemed to sometimes (115515) make autovectorization worse

So this removes it from the codegen crates entirely, and instead just tries to use <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html#method.typed_place_copy> instead of direct `memcpy` so things will still use load/store for immediates.
2024-04-09 08:51:32 -07:00
kxxt
f19c48e7a8 Set target-abi module flag for RISC-V targets
Fixes cross-language LTO on RISC-V targets (Fixes #121924)
2024-04-09 05:25:51 +02:00
bors
59c808fcd9 Auto merge of #122387 - DianQK:re-enable-early-otherwise-branch, r=cjgillot
Re-enable the early otherwise branch optimization

Closes #95162. Fixes #119014.

This is the first part of #121397.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement. This should have a pass to hoist instructions.

r? cjgillot
2024-04-09 01:02:29 +00:00
bors
ab5bda1aa7 Auto merge of #123645 - matthiaskrgr:rollup-yd8d7f1, r=matthiaskrgr
Rollup of 9 pull requests

Successful merges:

 - #122781 (Fix argument ABI for overaligned structs on ppc64le)
 - #123367 (Safe Transmute: Compute transmutability from `rustc_target::abi::Layout`)
 - #123518 (Fix `ByMove` coroutine-closure shim (for 2021 precise closure capturing behavior))
 - #123547 (bootstrap: remove unused pub fns)
 - #123564 (Don't emit divide-by-zero panic paths in `StepBy::len`)
 - #123578 (Restore `pred_known_to_hold_modulo_regions`)
 - #123591 (Remove unnecessary cast from `LLVMRustGetInstrProfIncrementIntrinsic`)
 - #123632 (parser: reduce visibility of unnecessary public `UnmatchedDelim`)
 - #123635 (CFI: Fix ICE in KCFI non-associated function pointers)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-04-08 20:31:08 +00:00
Matthias Krüger
9570ac4d28
Rollup merge of #123564 - scottmcm:step-by-div-zero, r=joboet
Don't emit divide-by-zero panic paths in `StepBy::len`

I happened to notice today that there's actually two such calls emitted in the assembly: <https://rust.godbolt.org/z/1Wbbd3Ts6>

Since they're impossible, hopefully telling LLVM that will also help optimizations elsewhere.
2024-04-08 22:06:22 +02:00
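A sketch of the scenario in codegen-test form (names and CHECK line assumed): `StepBy::len` divides by the step, which can never be zero, so no panic path should survive.

```rust
use std::iter::StepBy;
use std::ops::Range;

// CHECK-NOT: panic
#[no_mangle]
pub fn step_len(it: &StepBy<Range<usize>>) -> usize {
    it.len()
}
```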
Matthias Krüger
ecfc3384f1
Rollup merge of #122781 - nikic:ppc-abi-fix, r=cuviper
Fix argument ABI for overaligned structs on ppc64le

When passing a 16 (or higher) aligned struct by value on ppc64le, it needs to be passed as an array of `i128` rather than an array of `i64`. This will force the use of an even starting doubleword.

For the case of a 16 byte struct with alignment 16 it is important that `[1 x i128]` is used instead of `i128` -- apparently, the latter will get treated similarly to `[2 x i64]`, not exhibiting the correct ABI. Add a `force_array` flag to `Uniform` to support this.

The relevant clang code can be found here:
fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L878-L884)
fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L780-L784)

I think the corresponding psABI wording is this:

> Fixed size aggregates and unions passed by value are mapped to as
> many doublewords of the parameter save area as the value uses in
> memory. Aggregates and unions are aligned according to their
> alignment requirements. This may result in doublewords being
> skipped for alignment.

In particular the last sentence. Though I didn't find any wording for Clang's behavior of clamping the alignment to 16.

Fixes https://github.com/rust-lang/rust/issues/122767.

r? `@cuviper`
2024-04-08 22:06:20 +02:00
bors
211518e5fb Auto merge of #120614 - DianQK:simplify-switch-int, r=cjgillot
Transforms match into an assignment statement

Fixes #106459.

We should be able to do some similar transformations, like `enum` to `enum`.

r? mir-opt
2024-04-08 18:28:50 +00:00
bors
537aab7a2e Auto merge of #120131 - oli-obk:pattern_types_syntax, r=compiler-errors
Implement minimal, internal-only pattern types in the type system

rebase of https://github.com/rust-lang/rust/pull/107606

You can create pattern types with `std::pat::pattern_type!(ty is pat)`. The feature is incomplete and will panic on you if you use any pattern other than integral range patterns. The only way to create or deconstruct a pattern type is via `transmute`.

This PR's implementation differs from the MCP's text. Specifically

> This means you could implement different traits for different pattern types with the same base type. Thus, we just forbid implementing any traits for pattern types.

is violated in this PR. The reason is that we do need impls after all in order to make them usable as fields. Constants of the `std::time::Nanoseconds` struct type are used in patterns, so the type must be structural-eq, which it only can be if you derive several traits on it. It doesn't need to be structural-eq recursively, so we can just manually implement the relevant traits on the pattern type and use the pattern type as a private field.

Waiting on:

* [x] move all unrelated commits into their own PRs.
* [x] fix niche computation (see 2db07f94f44f078daffe5823680d07d4fded883f)
* [x] add lots more tests
* [x] T-types MCP https://github.com/rust-lang/types-team/issues/126 to finish
* [x] some commit cleanup
* [x] full self-review
* [x] remove 61bd325da19a918cc3e02bbbdce97281a389c648, it's not necessary anymore I think.
* [ ] ~~make sure we never accidentally leak pattern types to user code (add stability checks or feature gate checks and appropriate tests)~~ we don't even do this for the new float primitives
* [x] get approval that [the scope expansion to trait impls](https://rust-lang.zulipchat.com/#narrow/stream/326866-t-types.2Fnominated/topic/Pattern.20types.20types-team.23126/near/427670099) is ok

r? `@BoxyUwU`
2024-04-08 16:25:23 +00:00
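A heavily hedged sketch of the quoted syntax. The macro path and the transmute-only construction rule come from the PR description; the feature-gate names are guesses:

```rust
// Feature-gate names are assumptions, not taken from the PR.
#![feature(core_pattern_types, core_pattern_type)]
use std::pat::pattern_type;

// An integral range pattern, the only kind currently supported.
type Percent = pattern_type!(u32 is 0..=100);

fn from_raw(n: u32) -> Percent {
    // Per the PR, transmute is currently the only way to create
    // (or deconstruct) a pattern type.
    unsafe { std::mem::transmute(n) }
}
```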
Oli Scherer
84acfe86de Actually create ranged int types in the type system. 2024-04-08 12:02:19 +00:00
DianQK
928c57dc9a
Add test case for #119014 2024-04-08 19:20:04 +08:00
DianQK
1f061f47e2
Transforms match into an assignment statement 2024-04-08 19:00:53 +08:00
Philippe-Cholet
7a2678de7d Add invariant to VecDeque::pop_* that len < cap if pop successful
Similar to #114370 for VecDeque instead of Vec. It now uses `core::hint::assert_unchecked`.
2024-04-08 12:12:13 +02:00
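A minimal sketch of the invariant pattern described above (not the actual library code):

```rust
use core::hint::assert_unchecked;
use std::collections::VecDeque;

fn pop_front_hinted<T>(buf: &mut VecDeque<T>) -> Option<T> {
    let v = buf.pop_front()?;
    // SAFETY: an element was just removed, so len is strictly below capacity.
    unsafe { assert_unchecked(buf.len() < buf.capacity()) };
    Some(v)
}
```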
Kai Luo
d8d1e6ce21 Limited to little-endian targets 2024-04-08 11:11:11 +08:00
Nikita Popov
009280c5e3 Fix argument ABI for overaligned structs on ppc64le
When passing a 16 (or higher) aligned struct by value on ppc64le,
it needs to be passed as an array of `i128` rather than an array
of `i64`. This will force the use of an even starting register.

For the case of a 16 byte struct with alignment 16 it is important
that `[1 x i128]` is used instead of `i128` -- apparently, the
latter will get treated similarly to `[2 x i64]`, not exhibiting
the correct ABI. Add a `force_array` flag to `Uniform` to support
this.

The relevant clang code can be found here:
fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L878-L884)
fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L780-L784)

I think the corresponding psABI wording is this:

> Fixed size aggregates and unions passed by value are mapped to as
> many doublewords of the parameter save area as the value uses in
> memory. Aggregates and unions are aligned according to their
> alignment requirements. This may result in doublewords being
> skipped for alignment.

In particular the last sentence.

Fixes https://github.com/rust-lang/rust/issues/122767.
2024-04-08 11:15:36 +09:00
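A sketch of the kind of type affected, in codegen-test form (CHECK pattern assumed):

```rust
// 16 bytes with 16-byte alignment: must be lowered as `[1 x i128]` on
// ppc64le so the value starts at an even doubleword.
#[repr(C, align(16))]
pub struct Overaligned(pub u64, pub u64);

// CHECK: [1 x i128]
#[no_mangle]
pub extern "C" fn take(_x: Overaligned) {}
```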
bors
4e431fad67 Auto merge of #123561 - saethlin:str-unchecked-sub-index, r=scottmcm
Use unchecked_sub in str indexing

https://github.com/rust-lang/rust/pull/108763 applied this logic to indexing for slices, but of course `str` has its own separate impl.

Found this by skimming over the codegen for https://github.com/oxidecomputer/hubris/; their dist builds enable overflow checks so the lack of `unchecked_sub` was producing an impossible-to-hit overflow check and also inhibiting some inlining.

r? scottmcm
2024-04-07 12:49:15 +00:00
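A sketch of the affected pattern (function name assumed):

```rust
// The str index impl computes `end - start` after the bounds are already
// known to be ordered; with unchecked_sub no overflow branch should remain
// even under -C overflow-checks=on.
#[no_mangle]
pub fn subslice(s: &str, r: std::ops::Range<usize>) -> Option<&str> {
    s.get(r)
}
```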
bors
0e3235f85b Auto merge of #123555 - DianQK:update-llvm-18, r=cuviper
Update to LLVM 18.1.3

Fixes #122805.

This should work on all targets: https://rust.godbolt.org/z/svW8ha31z.

r? `@cuviper`
2024-04-07 06:33:58 +00:00
DianQK
5acfe772fa
Add the test case for #122805 2024-04-07 13:01:54 +08:00
Scott McMurray
00bd24766f Don't emit divide-by-zero panic paths in StepBy::len
I happened to notice today that there's actually two such calls emitted in the assembly: <https://rust.godbolt.org/z/1Wbbd3Ts6>

Since they're impossible, hopefully telling LLVM that will also help optimizations elsewhere.
2024-04-06 11:37:57 -07:00
Ben Kimock
712aab72df Use unchecked_sub in str indexing 2024-04-06 14:09:03 -04:00
Ben Kimock
a7912cb421 Put checks that detect UB under their own flag below debug_assertions 2024-04-06 11:21:47 -04:00
Matthias Krüger
ad3df4919d
Rollup merge of #123525 - maurer:no-id-dyn2, r=compiler-errors
CFI: Don't rewrite ty::Dynamic directly

Now that we're using a type folder, the arguments in predicates are processed automatically - we don't need to descend manually.

We also want to keep projection clauses around, and this does so.

r? `@compiler-errors`
2024-04-06 08:56:35 +02:00
Matthew Maurer
5083378f16 CFI: Don't rewrite ty::Dynamic directly
Now that we're using a type folder, the arguments in predicates are
processed automatically - we don't need to descend manually.

We also want to keep projection clauses around, and this does so.
2024-04-05 23:58:15 +00:00
Guillaume Gomez
5ceac29123
Rollup merge of #123487 - rcvalle:rust-cfi-restore-typeid-for-instance, r=compiler-errors
CFI: Restore typeid_for_instance default behavior

Restore typeid_for_instance default behavior of performing self type erasure, since it's the most common case and what it does most of the time. Using concrete self (or not performing self type erasure) is for assigning a secondary type id, and secondary type ids are only assigned when they're unique and to methods, and also are only tested for when methods are used as function pointers.
2024-04-05 22:33:27 +02:00
Ramon de C Valle
2498a9d464 CFI: Restore typeid_for_instance default behavior
Restore typeid_for_instance default behavior of performing self type
erasure, since it's the most common case and what it does most of the
time. Using concrete self (or not performing self type erasure) is for
assigning a secondary type id, and secondary type ids are only assigned
when they're unique and to methods, and also are only tested for when
methods are used as function pointers.
2024-04-04 21:19:33 -07:00
许杰友 Jieyou Xu (Joe)
476156aedf
Port issue-7349 to a codegen test 2024-04-04 21:59:08 +01:00
bors
29fe618f75 Auto merge of #123052 - maurer:addr-taken, r=compiler-errors
CFI: Support function pointers for trait methods

Adds support for both CFI and KCFI for function pointers to trait methods by attaching both concrete and abstract types to functions.

KCFI does this through generation of a `ReifyShim` on any function pointer for a method that could go into a vtable, and keeping this separate from `ReifyShim`s that are *intended* for vtable use by setting a `ReifyReason` on them.

CFI does this by setting both the concrete and abstract type on every instance.

This should land after #123024 or a similar PR, as it diverges the implementation of CFI vs KCFI.

r? `@compiler-errors`
2024-04-04 06:40:30 +00:00
Matthias Krüger
bc8415b9e6
Rollup merge of #122619 - erikdesjardins:cast, r=compiler-errors
Fix some unsoundness with PassMode::Cast ABI

Fixes #122617

Reviewable commit-by-commit. More info in each commit message.
2024-04-03 22:11:00 +02:00
bors
76cf07d5df Auto merge of #122225 - DianQK:nits-120268, r=cjgillot
Rename `UninhabitedEnumBranching` to `UnreachableEnumBranching`

Per [#120268](https://github.com/rust-lang/rust/pull/120268#discussion_r1517492060), I rename `UninhabitedEnumBranching` to `UnreachableEnumBranching` .

I addressed some review nits by adding comments.

I adjusted the workaround restrictions. This should be useful for `a <= b` and `if let Some/Ok(v)`. For enums with few variants, `early-tailduplication` should not cause compile-time overhead.

r? RalfJung
2024-04-03 06:22:23 +00:00
bors
88c2f4f5f5 Auto merge of #123385 - matthiaskrgr:rollup-v69vjbn, r=matthiaskrgr
Rollup of 8 pull requests

Successful merges:

 - #123198 (Add fn const BuildHasherDefault::new)
 - #123226 (De-LLVM the unchecked shifts [MCP#693])
 - #123302 (Make sure to insert `Sized` bound first into clauses list)
 - #123348 (rustdoc: add a couple of regression tests)
 - #123362 (Check that nested statics in thread locals are duplicated per thread.)
 - #123368 (CFI: Support non-general coroutines)
 - #123375 (rustdoc: synthetic auto trait impls: accept unresolved region vars for now)
 - #123378 (Update sysinfo to 0.30.8)

Failed merges:

 - #123349 (Fix capture analysis for by-move closure bodies)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-04-02 21:23:53 +00:00
bors
a77322c16f Auto merge of #118310 - scottmcm:three-way-compare, r=davidtwco
Add `Ord::cmp` for primitives as a `BinOp` in MIR

Update: most of this OP was written months ago.  See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.

---

There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches.  Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:

1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.

Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic.  Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical.  Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)?  But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)?  And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers.  Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.

As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR.  The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks.  Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues.  (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)

---

r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
2024-04-02 19:21:44 +00:00
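A small sketch of the kind of code whose MIR shrinks (names assumed):

```rust
use std::cmp::Ordering;

// The derived field-by-field comparison now lowers to a chain of
// `BinOp::Cmp` operations instead of open-coded branches.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
pub struct Triple(u16, i8, u32);

#[no_mangle]
pub fn cmp_triple(a: &Triple, b: &Triple) -> Ordering {
    a.cmp(b)
}
```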
Matthew Maurer
93c2bace58 CFI: Switch sense of type erasure flag
Previously, we had `NO_SELF_TYPE_ERASURE`, a negative configuration. Now
we have `ERASE_SELF_TYPE`, a positive configuration.
2024-04-02 18:24:44 +00:00
Scott McMurray
0601f0c66d De-LLVM the unchecked shifts [MCP#693]
This is just one part of the MCP, but it's the one that IMHO removes the most noise from the standard library code.

Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
2024-03-30 03:32:11 -07:00
bors
877d36b192 Auto merge of #122976 - caibear:optimize_reserve_for_push, r=cuviper
Remove len argument from RawVec::reserve_for_push

Removes `RawVec::reserve_for_push`'s `len` argument since it's always the same as capacity.
Also makes `Vec::insert` use `RawVec::reserve_for_push`.
2024-03-30 00:29:24 +00:00
Cai Bear
4500c83c62 Fix test. 2024-03-29 15:37:43 -07:00
bors
58dcd1fdb9 Auto merge of #123071 - rcvalle:rust-cfi-fix-method-fn-ptr-cast, r=compiler-errors
CFI: Fix methods as function pointer cast

Fix casting between methods and function pointers by assigning a secondary type id to methods with their concrete self so they can be used as function pointers.

This was split off from #116404.

cc `@compiler-errors` `@workingjubilee`
2024-03-29 09:04:05 +00:00
bors
db2f9759f4 Auto merge of #122671 - Mark-Simulacrum:const-panic-msg, r=Nilstrieb
Codegen const panic messages as function calls

This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting shared library (see [perf]).

A sample improvement from nightly:

```
        leaq    str.0(%rip), %rdi
        leaq    .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdx
        movl    $25, %esi
        callq   *_ZN4core9panicking5panic17h17cabb89c5bcc999E@GOTPCREL(%rip)
```

to this PR:

```
        leaq    .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdi
        callq   *_RNvNtNtCsduqIKoij8JB_4core9panicking11panic_const23panic_const_div_by_zero@GOTPCREL(%rip)
```

[perf]: https://perf.rust-lang.org/compare.html?start=a7e4de13c1785819f4d61da41f6704ed69d5f203&end=64fbb4f0b2d621ff46d559d1e9f5ad89a8d7789b&stat=instructions:u
2024-03-29 00:24:01 +00:00
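A sketch of a callsite that benefits; the `panic_const_div_by_zero` symbol is the one visible in the assembly above:

```rust
// Before: the callsite loaded a message pointer and length into registers.
// After: the panic path is a bare call to a dedicated panic_const function.
#[no_mangle]
pub fn quotient(a: u32, b: u32) -> u32 {
    a / b
}
```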
DianQK
ec359f7d9f
Restore the test checks for wider_reduce_into_iter
The minimum supported LLVM version is now 17.
2024-03-28 21:28:45 +08:00
Ramon de C Valle
8e6b4e91b6 CFI: Fix methods as function pointer cast
Fix casting between methods and function pointers by assigning a
secondary type id to methods with their concrete self so they can be
used as function pointers.
2024-03-27 16:19:17 -07:00
Matthias Krüger
6464e5b78c
Rollup merge of #123075 - rcvalle:rust-cfi-fix-drop-drop-in-place, r=compiler-errors
CFI: Fix drop and drop_in_place

Fix drop and drop_in_place by transforming self of drop and drop_in_place methods into a Drop trait objects.

This was split off from https://github.com/rust-lang/rust/pull/116404.

cc `@compiler-errors` `@workingjubilee`
2024-03-27 23:27:22 +01:00
Ramon de C Valle
0b860818e6 CFI: Fix drop and drop_in_place
Fix drop and drop_in_place by transforming self of drop and
drop_in_place methods into Drop trait objects.
2024-03-27 12:52:14 -07:00
clubby789
b500693ad7 Don't emit load metadata in debug mode 2024-03-25 18:32:45 +00:00
Scott McMurray
3da115a93b Add+Use mir::BinOp::Cmp 2024-03-23 23:23:41 -07:00
Jubilee
b9b65f816d
Rollup merge of #122875 - maurer:cfi-transparent-termination, r=workingjubilee
CFI: Support self_cell-like recursion

Current `transform_ty` attempts to avoid cycles when normalizing `#[repr(transparent)]` types to their interior, but runs afoul of this pattern used in `self_cell`:

```
struct X<T> {
  x: u8,
  p: PhantomData<T>,
}

#[repr(transparent)]
struct Y(X<Y>);
```

When attempting to normalize Y, it will still cycle indefinitely. By using a types-visited list, this will instead get expanded exactly one layer deep to X<Y>, and then stop, not attempting to normalize `Y` any further.

This PR was split off from #121962 as part of fixing the larger vtable compatibility issues.

r? ``````@workingjubilee``````
2024-03-23 22:59:42 -07:00
bors
d6eb0f5a09 Auto merge of #122582 - scottmcm:swap-intrinsic-v2, r=oli-obk
Let codegen decide when to `mem::swap` with immediates

Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.

Thus this PR introduces a new `typed_swap` intrinsic with a fallback body, and replaces that fallback implementation when swapping immediates or scalar pairs.

r? oli-obk

Replaces #111744, and means we'll never need more libs PRs like #111803 or #107140
2024-03-23 13:57:55 +00:00
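A sketch of the immediate case, in codegen-test form (CHECK line assumed):

```rust
// Swapping a register-sized value should stay in registers; no stack
// traffic is needed once the typed_swap fallback body is replaced.
// CHECK-NOT: alloca
#[no_mangle]
pub fn swap_u64(a: &mut u64, b: &mut u64) {
    std::mem::swap(a, b);
}
```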
Ben Kimock
6b794f6c80 Add the missing inttoptr when we ptrtoint in ptr atomics 2024-03-23 00:07:02 -04:00
Matthew Maurer
dec36c3d6e CFI: Support self_cell-like recursion
Current `transform_ty` attempts to avoid cycles when normalizing
`#[repr(transparent)]` types to their interior, but runs afoul of this
pattern used in `self_cell`:

```
struct X<T> {
  x: u8,
  p: PhantomData<T>,
}

#[repr(transparent)]
struct Y(X<Y>);
```

When attempting to normalize Y, it will still cycle indefinitely. By
using a types-visited list, this will instead get expanded exactly
one layer deep to X<Y>, and then stop, not attempting to normalize `Y`
any further.
2024-03-22 23:02:05 +00:00
Mark Rousskov
00f4daa276 Codegen const panic messages as function calls
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting binary.
2024-03-22 09:55:50 -04:00
bors
7762adccb2 Auto merge of #122456 - maurer:cfi-nonpassed, r=workingjubilee
CFI: Skip non-passed arguments

Rust will occasionally rely on fn((), X) -> Y being compatible with fn(X) -> Y, since () is a non-passed argument. Relax CFI by choosing not to encode non-passed arguments.

This PR was split off from #121962 as part of fixing the larger vtable compatibility issues.

r? `@workingjubilee`
2024-03-22 06:09:40 +00:00
Matthew Maurer
f2f0d255df CFI: Skip non-passed arguments
Rust will occasionally rely on fn((), X) -> Y being compatible with
fn(X) -> Y, since () is a non-passed argument. Relax CFI by choosing not
to encode non-passed arguments.
2024-03-21 22:26:26 +00:00
clubby789
5f254d8b66 Remove SpecOptionPartialEq 2024-03-19 16:32:01 +00:00
bors
148a41c6b5 Auto merge of #122375 - rcvalle:rust-cfi-break-tests-into-smaller-files, r=compiler-errors
CFI: Break tests into smaller files

Break type metadata identifiers tests into a smaller set of tests/files, and move CFI (and KCFI) codegen tests to a cfi (and kcfi) subdirectory.
2024-03-19 02:17:52 +00:00
Scott McMurray
6d2cb39ac5 Stop whining, tidy 2024-03-17 12:51:58 -07:00
Scott McMurray
7d537106a1 Let codegen decide when to mem::swap with immediates
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.

So introduce a new `typed_swap` intrinsic with a fallback body, but replace that implementation for immediates and scalar pairs.
2024-03-17 11:59:18 -07:00
Josh Stone
d9132de4ab Remove an obsolete ignore-llvm-version 2024-03-17 10:52:00 -07:00
Erik Desjardins
dec81ac223 disable crashing test on sparc 2024-03-17 13:40:27 -04:00
Josh Stone
29430554f6 Update the minimum external LLVM to 17 2024-03-17 10:11:04 -07:00
Erik Desjardins
8d5fd94e62 add tests for PassMode::Cast fixes
Tests added in cast-target-abi.rs, covering the single element, array,
and prefix cases in `CastTarget::llvm_type`, and the Rust-is-larger/smaller
cases in the Rust<->ABI copying code.

ffi-out-of-bounds-loads.rs was overhauled to be runnable on any
platform. Its alignment also increases due to the removal of a `min` in
the previous commit; this was probably an insufficient workaround for
this issue or similar. The higher alignment is fine, since the alloca is
actually aligned to 8 bytes, as the test checks now confirm.
2024-03-17 00:39:21 -04:00
bors
c563f2ee79 Auto merge of #122371 - oli-obk:visit_nested_body, r=tmiasko
Stop walking the bodies of statics for reachability, and evaluate them instead

cc `@saethlin` `@RalfJung`

cc #119214

This reuses the `DefIdVisitor` from `rustc_privacy`, because they basically try to do the same thing.

This PR's changes can probably be extended to constants, too, but let's tackle that separately, it's likely more involved.
2024-03-16 04:35:02 +00:00
Matthias Krüger
722514f466
Rollup merge of #122212 - erikdesjardins:byval-align2, r=wesleywiser
Copy byval argument to alloca if alignment is insufficient

Fixes #122211

"Ignore whitespace" recommended.
2024-03-14 20:00:18 +01:00
Oli Scherer
8332b47cae Stop walking the bodies of statics for reachability, and evaluate them instead 2024-03-14 14:10:45 +00:00
Oli Scherer
54d83beb38 Add test 2024-03-14 14:10:45 +00:00
Ramon de C Valle
6bd85c4de4 CFI: Break tests into smaller files
Break type metadata identifiers tests into a smaller set of tests/files,
and move CFI (and KCFI) codegen tests to a cfi (and kcfi) subdirectory.
2024-03-14 00:56:29 -07:00
bors
3cbb93223f Auto merge of #121668 - erikdesjardins:commonprim, r=scottmcm,oli-obk
Represent `Result<usize, Box<T>>` as ScalarPair(i64, ptr)

This allows types like `Result<usize, std::io::Error>` (and integers of differing sign, e.g. `Result<u64, i64>`) to be passed in a pair of registers instead of through memory, like `Result<u64, u64>` or `Result<Box<T>, Box<U>>` are today.

Fixes #97540.

r? `@ghost`
2024-03-13 15:25:35 +00:00
DianQK
f8656ef6e9
Update unreachable_enum_default_branch.rs 2024-03-13 22:35:11 +08:00
Erik Desjardins
9f55200a42 refine common_prim test
Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>
2024-03-13 01:17:15 -04:00
Ben Kimock
81d630453b Avoid lowering code under dead SwitchInt targets 2024-03-12 19:01:04 -04:00
bors
0fa7feaf3f Auto merge of #121282 - saethlin:gep-null-means-no-provenance, r=scottmcm
Lower transmutes from int to pointer type as gep on null

I thought of this while looking at https://github.com/rust-lang/rust/pull/121242. See that PR's description for why this lowering is preferable.

The UI test that's being changed here crashes without changing the transmutes into casts. Based on that, this PR should not be merged without a crater build-and-test run.
2024-03-12 04:11:37 +00:00
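A sketch of the lowering (the IR shape follows the PR title; the CHECK line and operand names are illustrative assumptions):

```rust
// An int-to-pointer transmute is now emitted as an offset from the null
// pointer rather than as an `inttoptr` instruction.
// CHECK: getelementptr i8, ptr null, i64 %{{.*}}
#[no_mangle]
pub fn int_to_ptr(addr: usize) -> *const u8 {
    unsafe { std::mem::transmute(addr) }
}
```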
bors
dc2ffa4054 Auto merge of #122036 - alexcrichton:test-wasm-with-wasi, r=oli-obk
Test wasm32-wasip1 in CI, not wasm32-unknown-unknown

This commit changes CI to no longer test the `wasm32-unknown-unknown` target and instead test the `wasm32-wasip1` target. There was some discussion of this in a [Zulip thread], and the motivations for this PR are:

* Runtime failures on `wasm32-unknown-unknown` print nothing, meaning all you get is "something failed". In contrast `wasm32-wasip1` can print to stdout/stderr.

* The unknown-unknown target is missing lots of pieces of libstd, and while `wasm32-wasip1` is also missing some pieces (e.g. threads), it's missing fewer. This means that many more tests can be run.

Overall my hope is to improve the debuggability of wasm failures on CI and ideally be a bit less of a maintenance burden.

This commit specifically removes the testing of `wasm32-unknown-unknown` and replaces it with testing of `wasm32-wasip1`. Along the way there were a number of other architectural changes made as well, including:

* A new `target.*.runtool` option can now be specified in `config.toml` which is passed as `--runtool` to `compiletest`. This is used to reimplement execution of WebAssembly in a less-wasm-specific fashion.

* The default value for `runtool` is an ambiently located WebAssembly runtime found on the system, if any. I've implemented logic for Wasmtime.

* Existing testing support for `wasm32-unknown-unknown` and Emscripten has been removed. I'm not aware of Emscripten testing being run any time recently and otherwise `wasm32-wasip1` is in theory the focus now.

* I've added a new `//@ needs-threads` directive for `compiletest` and classified a bunch of wasm-ignored tests as needing threads. In theory these tests can run on `wasm32-wasi-preview1-threads`, for example.

* I've tried to audit all existing tests that are either `ignore-emscripten` or `ignore-wasm*`. Many now run on `wasm32-wasip1` due to being able to emit error messages, for example. Many are updated with comments as to why they can't run as well.

* The `compiletest` output matching for `wasm32-wasip1` automatically uses "match a subset" mode implemented in `compiletest`. This is because WebAssembly runtimes often add extra information on failure, such as the `unreachable` instruction in `panic!`, which can't be matched against the golden output from native platforms.

* I've ported most existing `run-make` tests that use custom Node.js wrapper scripts to the new run-make-based-in-Rust infrastructure. To do this I added `wasmparser` as a dependency of `run-make-support` so the various wasm tests can use it to parse wasm files. The one test that executed WebAssembly now uses `wasmtime`-the-CLI to execute the test instead. I have not ported over an exception-handling test as Wasmtime doesn't implement this yet.

* I've updated the `test` crate to print out timing information for WASI targets as it can do that (gets a previously ignored test now passing).

* The `test-various` image now builds a WASI sysroot for the WASI target and additionally downloads a fixed release of Wasmtime, currently the latest one at 18.0.2, and uses that for testing.

[Zulip thread]: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Have.20wasm.20tests.20ever.20caused.20problems.20on.20CI.3F/near/424317944
2024-03-12 00:03:54 +00:00
Ben Kimock
2eb9c6d49e Lower transmutes from int to pointer type as gep on null 2024-03-11 18:19:17 -04:00
Alex Crichton
cf6d6050f7 Update test directives for wasm32-wasip1
* The WASI targets deal with the `main` symbol a bit differently than
  native so some `codegen` and `assembly` tests have been ignored.
* All `ignore-emscripten` directives have been updated to
  `ignore-wasm32` to be more clear that all wasm targets are ignored and
  it's not just Emscripten.
* Most `ignore-wasm32-bare` directives are now gone.
* Some ignore directives for wasm were switched to `needs-unwind`
  instead.
* Many `ignore-wasm32*` directives are removed as the tests work with
  WASI as opposed to `wasm32-unknown-unknown`.
2024-03-11 09:36:35 -07:00
Jubilee
028e2600c9
Rollup merge of #122320 - erikdesjardins:vtable, r=nikic
Use ptradd for vtable indexing

Extension of #121665.

After this, the only remaining usages of GEP are [this](cd81f5b27e/compiler/rustc_codegen_llvm/src/intrinsic.rs (L909-L920)) kinda janky Emscripten EH code, which I'll change in a future PR, and array indexing / pointer offsets, where there isn't yet a canonical `ptradd` form. (Out of curiosity I tried converting the latter to `ptradd(ptr, mul(size, index))`, but that causes codegen regressions right now.)

r? `@nikic`
2024-03-11 09:29:38 -07:00
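A sketch of the vtable case (CHECK pattern assumed; the concrete offset depends on the vtable layout):

```rust
pub trait Greet {
    fn greet(&self) -> u32;
}

// The method slot sits at a constant byte offset from the vtable pointer,
// so a plain `ptradd`-style GEP suffices.
// CHECK: getelementptr inbounds i8, ptr %{{.*}}, i64 {{[0-9]+}}
#[no_mangle]
pub fn call_greet(g: &dyn Greet) -> u32 {
    g.greet()
}
```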
Erik Desjardins
207fe38630 copy byval argument to alloca if alignment is insufficient 2024-03-11 09:38:54 -04:00
bors
a6d93acf5f Auto merge of #122050 - erikdesjardins:sret, r=nikic
Stop using LLVM struct types for byval/sret

For `byval` and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout.

\*: The alignment would matter, if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until #112157.

†: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue.

Split out from #121577.

r? `@nikic`
2024-03-11 04:45:27 +00:00
Erik Desjardins
a7cd803d02 use ptradd for vtable indexing
Like field offsets, these are always constant.
2024-03-10 22:47:30 -04:00
Erik Desjardins
f18c2f83e9 add -O to some tests which depend on attributes being added 2024-03-10 16:04:12 -04:00
Matthias Krüger
e8e41877a2
Rollup merge of #121642 - TimNN:test-v0, r=Mark-Simulacrum
Update a test to support Symbol Mangling V0

Note that since this is a symbol from `std`, overriding the symbol mangling version via the `compile-flags` directive does not work.
2024-03-10 10:58:15 +01:00
Erik Desjardins
8fdd5e044b convert codegen/repr/transparent-* tests to no_core, fix discrepancies 2024-03-09 23:16:02 -05:00
Guillaume Boisseau
e3c0158788
Rollup merge of #120504 - kornelski:try_with_capacity, r=Amanieu
Vec::try_with_capacity

Related to #91913

Implements try_with_capacity for `Vec`, `VecDeque`, and `String`. I can follow it up with more collections if desired.

`Vec::try_with_capacity()` is functionally equivalent to the current stable:

```rust
let mut v = Vec::new();
v.try_reserve_exact(n)?
```

However, `try_reserve` calls the non-inlined `finish_grow`, which requires the old and new `Layout`, and is designed to reallocate memory. There is benefit to using `try_with_capacity`, besides syntax convenience, because it generates much smaller code at the call site with a direct call to the allocator. There's a codegen test included.

It's also a very desirable functionality for users of `no_global_oom_handling` (Rust-for-Linux), since it makes a very commonly used function available in that environment (`with_capacity` is used much more frequently than all `(try_)reserve(_exact)`).
2024-03-09 21:40:06 +01:00
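A usage sketch of the new API (nightly-only at the time of this commit; the feature-gate name is an assumption):

```rust
#![feature(try_with_capacity)]
use std::collections::TryReserveError;

// One direct, fallible call to the allocator, with no finish_grow machinery.
fn make_buf(n: usize) -> Result<Vec<u8>, TryReserveError> {
    Vec::try_with_capacity(n)
}
```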
bors
1b2c53a15d Auto merge of #122182 - matthiaskrgr:rollup-gzimi4c, r=matthiaskrgr
Rollup of 8 pull requests

Successful merges:

 - #118623 (Improve std::fs::read_to_string example)
 - #119365 (Add asm goto support to `asm!`)
 - #120608 (Docs for std::ptr::slice_from_raw_parts)
 - #121832 (Add new Tier-3 target: `loongarch64-unknown-linux-musl`)
 - #121938 (Fix quadratic behavior of repeated vectored writes)
 - #122099 (Add  `#[inline]` to `BTreeMap::new` constructor)
 - #122103 (Make TAITs and ATPITs capture late-bound lifetimes in scope)
 - #122143 (PassWrapper: update for llvm/llvm-project@a331937197)

Failed merges:

 - #122076 (Tweak the way we protect in-place function arguments in interpreters)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-03-08 09:34:05 +00:00
Matthias Krüger
d774fbea7c
Rollup merge of #119365 - nbdd0121:asm-goto, r=Amanieu
Add asm goto support to `asm!`

Tracking issue: #119364

This PR implements asm-goto support, using the syntax described in "future possibilities" section of [RFC2873](https://rust-lang.github.io/rfcs/2873-inline-asm.html#asm-goto).

Currently I have only implemented the `label` part, not the `fallthrough` part (i.e. fallthrough is implicit). This doesn't reduce the expressiveness, though, since you can use label-break to get arbitrary control flow, or simply set a value and rely on the jump-threading optimisation to get the desired control flow. I can add that later if deemed necessary.

r? ``@Amanieu``
cc ``@ojeda``
2024-03-08 08:19:17 +01:00
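A minimal sketch of the `label` syntax (x86-64 assumed; the feature-gate name is an assumption):

```rust
#![feature(asm_goto)]
use std::arch::asm;

pub fn branch_via_asm() -> u32 {
    unsafe {
        // The `{}` placeholder is filled with the label of the block; if the
        // asm jumps there the function returns 1, otherwise it falls through.
        asm!("jmp {}", label { return 1 });
    }
    0
}
```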
bors
14fbc3c005 Auto merge of #120268 - DianQK:otherwise_is_last_variant_switchs, r=oli-obk
Replace the default branch with an unreachable branch if it is the last variant

Fixes #119520. Fixes #110097.

LLVM currently has limited ability to eliminate dead branches in switches, even with the patch of https://github.com/llvm/llvm-project/issues/73446.

The main reasons are as follows:

- Additional costs are required to calculate the range of values, and there exist many scenarios that cannot be analyzed accurately.
- Matching values by bitwise calculation cannot handle odd branches, nor can it handle values like `-1, 0, 1`. See [SimplifyCFG.cpp#L5424](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L5424) and https://llvm.godbolt.org/z/qYMqhvMa8
- The current range information is continuous, even if the metadata for the range is submitted. See [ConstantRange.cpp#L1869-L1870](https://github.com/llvm/llvm-project/blob/llvmorg-17.0.6/llvm/lib/IR/ConstantRange.cpp#L1869-L1870).
- The metadata of the range may be lost in passes such as SROA. See https://rust.godbolt.org/z/e7f87vKMK.

Although we can make improvements, I think it would be more appropriate to put this issue to rustc first. After all, we can easily know the possible values.

Note that we've currently found a slow compilation problem in the presence of unreachable branches. See
https://github.com/llvm/llvm-project/issues/78578.

r? compiler
2024-03-08 07:18:17 +00:00
bors
79d246112d Auto merge of #122048 - erikdesjardins:inbounds, r=oli-obk
Use GEP inbounds for ZST and DST field offsets

ZST field offsets have been non-`inbounds` since I made [this old layout change](https://github.com/rust-lang/rust/pull/73453/files#diff-160634de1c336f2cf325ff95b312777326f1ab29fec9b9b21d5ee9aae215ecf5). Before that, they would have been `inbounds` due to using `struct_gep`. Using `inbounds` for ZSTs likely doesn't matter for performance, but I'd like to remove the special case.

DST field offsets have been non-`inbounds` since the alignment-aware DST field offset computation was first [implemented](a2557d472e (diff-04fd352da30ca186fe0bb71cc81a503d1eb8a02ca17a3769e1b95981cd20964aR1188)) in 1.6 (back then `GEPi()` would be used for `inbounds`), but I don't think there was any reason for it.

Split out from #121577 / #121665.

r? `@oli-obk`

cc `@RalfJung` -- is there some weird situation where field offsets can't be `inbounds`?

Note that it's fine for `inbounds` offsets to be one-past-the-end, so it's okay even if there's a ZST as the last field in the layout:

> The base pointer has an in bounds address of an allocated object, which means that it points into an allocated object, or to its end. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)

For https://github.com/rust-lang/unsafe-code-guidelines/issues/93, zero-offset GEP is (now) always `inbounds`:

> Note that getelementptr with all-zero indices is always considered to be inbounds, even if the base pointer does not point to an allocated object. [(link)](https://llvm.org/docs/LangRef.html#getelementptr-instruction)
2024-03-08 02:01:51 +00:00
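A sketch of the ZST case (layout pinned with repr(C) so the offset is fixed; CHECK line assumed):

```rust
#[repr(C)]
pub struct S {
    a: u64,
    z: (), // zero-sized tail field at offset 8, i.e. the end of the object
}

// CHECK: getelementptr inbounds i8, ptr %{{.*}}, i64 8
#[no_mangle]
pub fn zst_field(s: &S) -> &() {
    &s.z
}
```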
DianQK
08ae8380ce
Replace the default branch with an unreachable branch if it is the last variant 2024-03-07 22:58:51 +08:00
Erik Desjardins
e349900339 add test for extern type 2024-03-06 19:53:45 -05:00
Erik Desjardins
5ccada66a2 make check lines for int/ptr common prim test more permissive
It seems that LLVM 17 doesn't fully optimize out unwrap_unchecked.

We can just loosen the check lines to account for this, since we don't
really care about the exact instructions, we just want to make sure that
inttoptr/ptrtoint aren't used for Box.
2024-03-06 19:36:09 -05:00
Alex Crichton
75fa9f6dec compiletest: Add a //@ needs-threads directive
This commit is extracted from #122036 and adds a new directive to the
`compiletest` test runner, `//@ needs-threads`. This is intended to
capture the need that a target must implement threading to execute a
specific test, typically one that uses `std::thread`. This is primarily
done for WebAssembly targets which currently do not have threads by
default. This enables transitioning a lot of `//@ ignore-wasm*`-style
ignores into a more self-documenting `//@ needs-threads` directive.
Additionally the `wasm32-wasi-preview1-threads` target, for example,
does actually have threads, but isn't tested in CI at this time. This
change enables running these tests for that target, but not other wasm
targets.
2024-03-06 12:35:07 -08:00
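A sketch of how a test opts in (the directive is from the commit; the test body is illustrative):

```rust
//@ needs-threads

fn main() {
    let t = std::thread::spawn(|| 2 + 2);
    assert_eq!(t.join().unwrap(), 4);
}
```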
Erik Desjardins
96a72676d1 use [N x i8] for byval/sret types
This avoids depending on LLVM's struct types to determine the size of
the byval/sret slot.
2024-03-05 18:54:45 -05:00
Ralf Jung
f391c0793b only set noalias on Box with the global allocator 2024-03-05 15:03:33 +01:00
Erik Desjardins
8ebd307d2a use GEP inbounds for ZST and DST field offsets
For the former, it's fine for `inbounds` offsets to be one-past-the-end,
so it's okay even if the ZST is the last field in the layout:

> The base pointer has an in bounds address of an allocated object,
> which means that it points into an allocated object, or to its end.

https://llvm.org/docs/LangRef.html#getelementptr-instruction

For the latter, even DST fields must always be inside the layout
(or to its end for ZSTs), so using inbounds is also fine there.
2024-03-04 09:32:33 -05:00
bors
70aa0b86c0 Auto merge of #121665 - erikdesjardins:ptradd, r=nikic
Always generate GEP i8 / ptradd for struct offsets

This implements #98615, and goes a bit further to remove `struct_gep` entirely.

Upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](https://github.com/llvm/llvm-project/pull/68882) all constant-offset GEPs to i8, which has roughly the same effect as this change.

Fixes #121719.

Split out from #121577.

r? `@nikic`
2024-03-03 22:21:53 +00:00
Ramon de C Valle
dee4e02102 Add initial support for DataFlowSanitizer
Adds initial support for DataFlowSanitizer to the Rust compiler. It
currently supports `-Zsanitizer-dataflow-abilist`. Additional options
for it can be passed to LLVM command line argument processor via LLVM
arguments using `llvm-args` codegen option (e.g.,
`-Cllvm-args=-dfsan-combine-pointer-labels-on-load=false`).
2024-03-01 18:50:40 -08:00
Kornel
78fb977d6b try_with_capacity for Vec, VecDeque, String
#91913
2024-03-01 18:24:02 +00:00
Guillaume Gomez
36bd9ef5a8
Rollup merge of #120820 - CKingX:cpu-base-minimum, r=petrochenkov,ChrisDenton
Enable CMPXCHG16B, SSE3, SAHF/LAHF and 128-bit Atomics (in nightly) in Windows x64

As Rust plans to set Windows 10 as the minimum supported OS for the target x86_64-pc-windows-msvc, I have added the cmpxchg16b and sse3 features. Windows 10 requires CMPXCHG16B, LAHF/SAHF, and PrefetchW as stated in the requirements [here](https://download.microsoft.com/download/c/1/5/c150e1ca-4a55-4a7e-94c5-bfc8c2e785c5/Windows%2010%20Minimum%20Hardware%20Requirements.pdf). Furthermore, CPUs that meet these requirements also have SSE3 ([see](https://walbourn.github.io/directxmath-sse3-and-ssse3/))
2024-02-29 17:08:36 +01:00
Guillaume Gomez
b2c3279984
Rollup merge of #121700 - rcvalle:rust-cfi-dont-compress-user-defined-builtin-types, r=compiler-errors
CFI: Don't compress user-defined builtin types

Doesn't compress user-defined builtin types (see https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin and https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-compression).
2024-02-29 14:33:51 +01:00
Erik Desjardins
401651015d test merging of multiple match branches that access fields of the same offset 2024-02-27 23:14:36 -05:00
Erik Desjardins
c1017d4828 use non-inbounds GEP for ZSTs, add fixmes 2024-02-27 23:00:54 -05:00
Ramon de C Valle
8f7b921f52 CFI: Don't compress user-defined builtin types
Doesn't compress user-defined builtin types (see
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-builtin and
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-compression).
2024-02-27 12:23:48 -08:00
Erik Desjardins
4dabbcb23b allow using scalarpair with a common prim of ptr/ptr-sized-int 2024-02-27 00:09:12 -05:00
Erik Desjardins
123015e722 always use gep inbounds i8 (ptradd) for field offsets 2024-02-26 22:28:09 -05:00
bors
71ffdf7ff7 Auto merge of #121655 - matthiaskrgr:rollup-qpx3kks, r=matthiaskrgr
Rollup of 4 pull requests

Successful merges:

 - #121598 (rename 'try' intrinsic to 'catch_unwind')
 - #121639 (Update books)
 - #121648 (Update Vec and String `{from,into}_raw_parts`-family docs)
 - #121651 (Properly emit `expected ;` on `#[attr] expr`)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-02-27 00:55:14 +00:00
Matthias Krüger
d95c321062
Rollup merge of #121598 - RalfJung:catch_unwind, r=oli-obk
rename 'try' intrinsic to 'catch_unwind'

The intrinsic has nothing to do with `try` blocks, and corresponds to the stable `catch_unwind` function, so this makes a lot more sense IMO.

Also rename Miri's special function while we are at it, to reflect the level of abstraction it works on: it's an unwinding mechanism, on which Rust implements panics.
2024-02-27 00:40:00 +01:00
bors
5c786a7fe3 Auto merge of #121516 - RalfJung:platform-intrinsics-begone, r=oli-obk
remove platform-intrinsics ABI; make SIMD intrinsics be regular intrinsics

`@Amanieu` `@workingjubilee` I don't think there is any reason these need to be "special"? The [original RFC](https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html) indicated eventually making them stable, but I think that is no longer the plan, so seems to me like we can clean this up a bit.

Blocked on https://github.com/rust-lang/stdarch/pull/1538, https://github.com/rust-lang/rust/pull/121542.
2024-02-26 22:24:16 +00:00
Tim Neumann
05a6f65d81 Update a test to support Symbol Mangling V0 2024-02-26 18:12:07 +01:00
Ralf Jung
b4ca582b89 rename 'try' intrinsic to 'catch_unwind' 2024-02-26 11:10:18 +01:00
Guillaume Gomez
0e08be5360
Rollup merge of #120656 - Zalathar:filecheck-flags, r=wesleywiser
Allow tests to specify a `//@ filecheck-flags:` header

This allows individual codegen/assembly/mir-opt tests to pass extra flags to the LLVM `filecheck` tool as needed.

---

The original motivation was noticing that `tests/run-make/instrument-coverage` was very close to being an ordinary codegen test, except that it needs some extra logic to set up platform-specific variables to be passed into filecheck.

I then saw the comment in `verify_with_filecheck` indicating that a `filecheck-flags` header might be useful for other purposes as well.
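For illustration, a test could then use the new header like this (the `-D` define name and value are hypothetical; anything on the `filecheck-flags` line is appended to the filecheck invocation):

```rust
//@ compile-flags: -Copt-level=3
//@ filecheck-flags: -DPREFIX=_ZN4core
// CHECK: call void @[[PREFIX]]
```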
2024-02-26 10:27:41 +01:00
Markus Reiter
b2fbb8a053
Use generic NonZero in tests. 2024-02-25 12:03:48 +01:00
Ralf Jung
c1d0e489e5 fix use of platform_intrinsics in tests 2024-02-25 08:15:44 +01:00
bors
89d8e3116c Auto merge of #120650 - clubby789:switchint-const, r=saethlin
Use `br` instead of a conditional when switching on a constant boolean

r? `@ghost`
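A shape that can hit the new path, sketched: once const-propagation has made the condition a known constant, codegen emits a plain `br` rather than a conditional branch on a constant `i1`.

```rust
// After MIR optimizations the switch operand is a constant, so the
// backend can jump unconditionally (and the dead arm folds away).
pub fn pick() -> u32 {
    let flag = true;
    if flag { 1 } else { 2 }
}
```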
2024-02-25 01:27:44 +00:00
Gary Guo
4677a71369 Add tests for asm goto 2024-02-24 19:49:16 +00:00
Ben Kimock
2f3c0b9859 Ignore less tests in debug builds 2024-02-23 18:04:01 -05:00
clubby789
7159aed51e Use br instead of conditional when branching on constant 2024-02-23 10:52:55 +00:00
Zalathar
e56cc8408d Remove unhelpful DEFINE_INTERNAL from filecheck flags
This define was copied over from the run-make version of the test, but doesn't
seem to serve any useful purpose.
2024-02-23 11:29:01 +11:00
Zalathar
0c19c632ab Convert tests/run-make/instrument-coverage to an ordinary codegen test
This test was already very close to being an ordinary codegen test, except that
it needed some extra logic to set a few variables based on (target) platform
characteristics.

Now that we have support for `//@ filecheck-flags:`, we can instead set those
variables using the normal test revisions mechanism.
2024-02-23 11:28:59 +11:00
Zalathar
c1889b549b Move existing coverage codegen tests into a subdirectory
This makes room for migrating over `tests/run-make/instrument-coverage`,
without increasing the number of top-level items in the codegen test directory.
2024-02-23 11:28:09 +11:00
Zalathar
baec3076db Allow tests to specify a //@ filecheck-flags: header
Any flags specified here will be passed to LLVM's `filecheck` tool, in tests
that use that tool.
2024-02-23 11:28:06 +11:00
Zalathar
36f298c93d Add some simple meta-tests for the handling of filecheck flags 2024-02-23 11:27:38 +11:00
许杰友 Jieyou Xu (Joe)
6e48b96692
[AUTO_GENERATED] Migrate compiletest to use ui_test-style //@ directives 2024-02-22 16:04:04 +00:00
bors
52dba5ffe7 Auto merge of #121225 - RalfJung:simd-extract-insert-const-idx, r=oli-obk,Amanieu
require simd_insert, simd_extract indices to be constants

As discussed in https://github.com/rust-lang/rust/issues/77477 (see in particular [here](https://github.com/rust-lang/rust/issues/77477#issuecomment-703149102)). This PR doesn't touch codegen yet -- the first step is to ensure that the indices are always constants; the second step is to then make use of this fact in backends.

Blocked on https://github.com/rust-lang/stdarch/pull/1530 propagating to the rustc repo.
2024-02-22 09:59:41 +00:00
Ralf Jung
07b6240947 remove simd_reduce_{min,max}_nanless 2024-02-21 20:50:47 +01:00
bors
bb8b11e67d Auto merge of #120718 - saethlin:reasonable-fast-math, r=nnethercote
Add "algebraic" fast-math intrinsics, based on fast-math ops that cannot return poison

Setting all of LLVM's fast-math flags makes our fast-math intrinsics very dangerous, because for some inputs the result is UB. This set of flags permits common algebraic transformations, but according to the [LangRef](https://llvm.org/docs/LangRef.html#fastmath), only the flags `nnan` (no NaNs) and `ninf` (no infs) can produce poison.

And this uses the algebraic float ops to fix https://github.com/rust-lang/rust/issues/120720

cc `@orlp`
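A sketch of using the new ops (nightly `core_intrinsics`; `fadd_algebraic` is one of the intrinsics this PR adds):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::fadd_algebraic;

// LLVM may reassociate and vectorize this reduction, but unlike the
// full fast-math flag set, no input value makes the result poison.
fn dot(xs: &[f32], ys: &[f32]) -> f32 {
    xs.iter()
        .zip(ys)
        .fold(0.0, |acc, (&x, &y)| fadd_algebraic(acc, x * y))
}
```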
2024-02-21 09:43:33 +00:00
Ben Kimock
cc73b71e8e Add "algebraic" versions of the fast-math intrinsics 2024-02-20 12:39:03 -05:00
Ralf Jung
e19f89b5ff delete a test that no longer makes sense 2024-02-20 08:37:47 +01:00
CKingX
2d25c3b369
Updated test to account for added previous features (thanks erikdesjardins!) 2024-02-19 21:59:13 -08:00
bors
158f00a1c5 Auto merge of #118264 - lukas-code:optimized-draining, r=the8472
Optimize `VecDeque::drain` for (half-)open ranges

The most common use cases of `VecDeque::drain` consume either the entire queue or elements from the front or back.[^1] This PR makes these operations faster by optimizing the generated code of the destructor of the drain:

* `.drain(..)` is now the same as `.clear()`.
* `.drain(n..)` is now (almost[^2]) the same as `.truncate(n)`.
* `.drain(..n)` is now an efficient "advance" function. This operation is not provided by a dedicated function and optimizing it is my main motivation for this PR.

Previously, all of these cases generated a function call to the destructor of the `DropGuard`, emitting a lot of unused machine code as well as unnecessary branches and loads/stores of stack variables.

There are no algorithmic changes in this PR, but it simplifies the code enough to allow LLVM to recognize the special cases and optimize accordingly. Most notably, it allows elimination of the rather large [`wrap_copy`] function.

Some [rudimentary microbenchmarks][benches] show a performance improvement of **~3x-4x** on my machine for the special cases and roughly equal performance for the general case.

Best reviewed commit by commit.
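A sketch of the three special-cased call shapes (each on its own queue, since `drain` still panics on out-of-range indices):

```rust
use std::collections::VecDeque;

fn clear_all(q: &mut VecDeque<i32>) {
    q.drain(..); // now codegens like q.clear()
}

fn keep_front(q: &mut VecDeque<i32>, n: usize) {
    q.drain(n..); // now (almost) like q.truncate(n)
}

fn drop_front(q: &mut VecDeque<i32>, n: usize) {
    q.drain(..n); // the efficient "advance" case
}
```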

[^1]: source: GitHub code search: [full range `drain(..)` = 7.5k results][full], [from front `drain(..n)` = 3.2k results][front], [from back `drain(n..)` = 1.6k results][back], [from middle `drain(n..m)` = <500 results][middle]

[^2]: `.drain(0..)` and `.clear()` reset the head to 0, but `.truncate(0)` does not.

[full]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%280%3F%5C.%5C.%5C%29%2F+lang%3ARust
[front]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%280%3F%5C.%5C.%5B%5E%29%5D.*%5C%29%2F+lang%3ARust
[back]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%28%5B%5E0%5D.*%5C.%5C.%5C%29%2F+lang%3ARust
[middle]: https://github.com/search?type=code&q=%2FVecDeque%28.%7C%5Cn%29%2B%5C.drain%5C%28%5B%5E0%5D.*%5C.%5C.%5B%5E%29%5D.*%5C%29%2F+lang%3ARust
[`wrap_copy`]: 4fd68eb47b/library/alloc/src/collections/vec_deque/mod.rs (L262-L391)
[benches]: https://gist.github.com/lukas-code/c97bd707d074c4cc31f241edbc7fd2a2

<details>
<summary>generated assembly</summary>

before:
```asm
clear:
	sub rsp, 40
	mov rax, qword ptr [rdi + 24]
	mov qword ptr [rdi + 24], 0
	mov qword ptr [rsp], rdi
	mov qword ptr [rsp + 8], rax
	xorps xmm0, xmm0
	movups xmmword ptr [rsp + 16], xmm0
	mov qword ptr [rsp + 32], rax
	test rax, rax
	je .LBB1_2
	mov rcx, qword ptr [rdi]
	mov rdx, qword ptr [rdi + 16]
	xor esi, esi
	cmp rdx, rcx
	cmovae rsi, rcx
	sub rdx, rsi
	mov rsi, rcx
	sub rsi, rdx
	lea rdi, [rdx + rax]
	cmp rsi, rax
	cmovb rdi, rcx
	sub rdi, rdx
	mov qword ptr [rsp + 16], rdi
	mov qword ptr [rsp + 32], 0
.LBB1_2:
	mov rdi, rsp
	call core::ptr::drop_in_place<<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<i32,alloc::alloc::Global>>
	add rsp, 40
	ret

truncate:
	mov rax, qword ptr [rdi + 24]
	sub rax, rsi
	jbe .LBB2_2
	sub rsp, 40
	mov qword ptr [rdi + 24], rsi
	mov qword ptr [rsp], rdi
	mov qword ptr [rsp + 8], rax
	mov rcx, qword ptr [rdi]
	mov rdx, qword ptr [rdi + 16]
	add rdx, rsi
	xor edi, edi
	cmp rdx, rcx
	cmovae rdi, rcx
	mov qword ptr [rsp + 24], 0
	sub rdx, rdi
	mov rdi, rcx
	sub rdi, rdx
	lea r8, [rdx + rax]
	cmp rdi, rax
	cmovb r8, rcx
	sub rsi, rdx
	add rsi, r8
	mov qword ptr [rsp + 16], rsi
	mov qword ptr [rsp + 32], 0
	mov rdi, rsp
	call core::ptr::drop_in_place<<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<i32,alloc::alloc::Global>>
	add rsp, 40

advance:
	mov rcx, qword ptr [rdi + 24]
	mov rax, rcx
	sub rax, rsi
	jbe .LBB3_1
	sub rsp, 40
	mov qword ptr [rdi + 24], 0
	mov qword ptr [rsp], rdi
	mov qword ptr [rsp + 8], rsi
	mov qword ptr [rsp + 16], 0
	mov qword ptr [rsp + 24], rax
	mov qword ptr [rsp + 32], rsi
	test rsi, rsi
	je .LBB3_6
	mov rax, qword ptr [rdi]
	mov rcx, qword ptr [rdi + 16]
	xor edx, edx
	cmp rcx, rax
	cmovae rdx, rax
	sub rcx, rdx
	mov rdx, rax
	sub rdx, rcx
	lea rdi, [rcx + rsi]
	cmp rdx, rsi
	cmovb rdi, rax
	sub rdi, rcx
	mov qword ptr [rsp + 16], rdi
	mov qword ptr [rsp + 32], 0
.LBB3_6:
	mov rdi, rsp
	call core::ptr::drop_in_place<<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<i32,alloc::alloc::Global>>
	add rsp, 40
	ret
.LBB3_1:
	test rcx, rcx
	je .LBB3_3
	mov qword ptr [rdi + 24], 0
.LBB3_3:
	mov qword ptr [rdi + 16], 0
	ret

remove:
	sub rsp, 40
	cmp rdx, rsi
	jb .LBB4_5
	mov rax, qword ptr [rdi + 24]
	mov rcx, rax
	sub rcx, rdx
	jb .LBB4_6
	mov qword ptr [rdi + 24], rsi
	mov qword ptr [rsp], rdi
	sub rdx, rsi
	mov qword ptr [rsp + 8], rdx
	mov qword ptr [rsp + 16], rsi
	mov qword ptr [rsp + 24], rcx
	mov qword ptr [rsp + 32], rdx
	je .LBB4_4
	mov rax, qword ptr [rdi]
	mov rcx, qword ptr [rdi + 16]
	add rcx, rsi
	xor edi, edi
	cmp rcx, rax
	cmovae rdi, rax
	sub rcx, rdi
	mov rdi, rax
	sub rdi, rcx
	lea r8, [rcx + rdx]
	cmp rdi, rdx
	cmovb r8, rax
	sub rsi, rcx
	add rsi, r8
	mov qword ptr [rsp + 16], rsi
	mov qword ptr [rsp + 32], 0
.LBB4_4:
	mov rdi, rsp
	call core::ptr::drop_in_place<<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<i32,alloc::alloc::Global>>
	add rsp, 40
	ret
.LBB4_5:
	lea rax, [rip + .L__unnamed_2]
	mov rdi, rsi
	mov rsi, rdx
	mov rdx, rax
	call qword ptr [rip + core::slice::index::slice_index_order_fail@GOTPCREL]
.LBB4_6:
	lea rcx, [rip + .L__unnamed_2]
	mov rdi, rdx
	mov rsi, rax
	mov rdx, rcx
	call qword ptr [rip + core::slice::index::slice_end_index_len_fail@GOTPCREL]

core::ptr::drop_in_place<<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<i32,alloc::alloc::Global>>:
	push rbp
	push r15
	push r14
	push r13
	push r12
	push rbx
	sub rsp, 24
	mov rsi, qword ptr [rdi + 32]
	test rsi, rsi
	je .LBB0_2
	mov rax, qword ptr [rdi + 16]
	add rsi, rax
	jb .LBB0_45
.LBB0_2:
	mov r13, qword ptr [rdi]
	mov rbp, qword ptr [rdi + 8]
	mov rbx, qword ptr [r13 + 24]
	lea r12, [rbx + rbp]
	mov r15, qword ptr [rdi + 24]
	lea rsi, [r15 + r12]
	test rbx, rbx
	je .LBB0_10
	test r15, r15
	je .LBB0_42
	cmp rbx, r15
	jbe .LBB0_12
	mov r14, qword ptr [r13]
	mov rax, qword ptr [r13 + 16]
	add r12, rax
	xor ecx, ecx
	cmp r12, r14
	mov rdx, r14
	cmovb rdx, rcx
	sub r12, rdx
	add rbx, rax
	cmp rbx, r14
	cmovae rcx, r14
	sub rbx, rcx
	mov rcx, rbx
	sub rcx, r12
	je .LBB0_42
	mov rdi, qword ptr [r13 + 8]
	mov rax, rcx
	add rax, r14
	cmovae rax, rcx
	mov r8, r14
	sub r8, r12
	mov rcx, r14
	sub rcx, rbx
	mov rdx, r15
	sub rdx, r8
	mov qword ptr [rsp + 16], rsi
	jbe .LBB0_18
	cmp rax, r15
	jae .LBB0_24
	mov rdx, r15
	sub rdx, r8
	shl rdx, 2
	cmp r15, rcx
	jbe .LBB0_30
	sub r8, rcx
	mov qword ptr [rsp], rdi
	mov rax, qword ptr [rsp]
	lea rdi, [rax + 4*r8]
	mov rsi, qword ptr [rsp]
	mov qword ptr [rsp + 8], rcx
	mov r15, r8
	call qword ptr [rip + memmove@GOTPCREL]
	sub r14, r15
	mov rax, qword ptr [rsp]
	lea rsi, [rax + 4*r14]
	shl r15, 2
	mov rdi, qword ptr [rsp]
	mov rdx, r15
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, qword ptr [rsp]
	lea rsi, [rdi + 4*r12]
	lea rdi, [rdi + 4*rbx]
	mov r15, qword ptr [rsp + 8]
	jmp .LBB0_36
.LBB0_10:
	test r15, r15
	je .LBB0_17
	mov rax, qword ptr [r13]
	sub rsi, rbp
	add rbp, qword ptr [r13 + 16]
	xor ecx, ecx
	cmp rbp, rax
	cmovae rcx, rax
	sub rbp, rcx
	mov qword ptr [r13 + 16], rbp
	jmp .LBB0_43
.LBB0_12:
	mov rdx, qword ptr [r13 + 16]
	mov r15, qword ptr [r13]
	lea rax, [rdx + rbp]
	xor ecx, ecx
	cmp rax, r15
	cmovae rcx, r15
	mov r12, rax
	sub r12, rcx
	mov rcx, r12
	sub rcx, rdx
	je .LBB0_41
	mov rdi, qword ptr [r13 + 8]
	mov rax, rcx
	add rax, r15
	cmovae rax, rcx
	mov r8, r15
	sub r8, rdx
	mov rcx, r15
	sub rcx, r12
	mov r14, rbx
	sub r14, r8
	mov qword ptr [rsp + 16], rsi
	jbe .LBB0_21
	cmp rax, rbx
	jae .LBB0_26
	mov qword ptr [rsp], rdx
	mov rdx, rbx
	sub rdx, r8
	shl rdx, 2
	cmp rbx, rcx
	jbe .LBB0_32
	sub r8, rcx
	mov rbx, rdi
	lea rdi, [rdi + 4*r8]
	mov rsi, rbx
	mov qword ptr [rsp + 8], rcx
	mov r14, r8
	call qword ptr [rip + memmove@GOTPCREL]
	sub r15, r14
	lea rsi, [rbx + 4*r15]
	shl r14, 2
	mov rdi, rbx
	mov rdx, r14
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, rbx
	mov rax, qword ptr [rsp]
	lea rsi, [rbx + 4*rax]
	lea rdi, [rbx + 4*r12]
	mov rbx, qword ptr [rsp + 8]
	jmp .LBB0_40
.LBB0_17:
	xorps xmm0, xmm0
	movups xmmword ptr [r13 + 16], xmm0
	jmp .LBB0_44
.LBB0_18:
	mov r14, r15
	sub r14, rcx
	jbe .LBB0_28
	cmp rax, r15
	jae .LBB0_33
	lea rax, [rcx + r12]
	sub r15, rcx
	lea rsi, [rdi + 4*rax]
	shl r15, 2
	mov r14, rdi
	mov rdx, r15
	mov r15, rcx
	jmp .LBB0_31
.LBB0_21:
	mov r14, rbx
	sub r14, rcx
	jbe .LBB0_29
	cmp rax, rbx
	jae .LBB0_34
	lea rax, [rcx + rdx]
	sub rbx, rcx
	lea rsi, [rdi + 4*rax]
	shl rbx, 2
	mov r14, rdi
	mov r15, rdx
	mov rdx, rbx
	mov rbx, rcx
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r14
	lea rsi, [r14 + 4*r15]
	lea rdi, [r14 + 4*r12]
	jmp .LBB0_40
.LBB0_24:
	sub r15, rcx
	jbe .LBB0_35
	sub rcx, r8
	mov qword ptr [rsp + 8], rcx
	lea rsi, [rdi + 4*r12]
	mov r12, rdi
	lea rdi, [rdi + 4*rbx]
	lea rdx, [4*r8]
	mov r14, r8
	call qword ptr [rip + memmove@GOTPCREL]
	add r14, rbx
	lea rdi, [r12 + 4*r14]
	mov rbx, qword ptr [rsp + 8]
	lea rdx, [4*rbx]
	mov rsi, r12
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r12
	lea rsi, [r12 + 4*rbx]
	jmp .LBB0_36
.LBB0_26:
	sub rbx, rcx
	jbe .LBB0_37
	sub rcx, r8
	lea rsi, [rdi + 4*rdx]
	mov r15, rdi
	lea rdi, [rdi + 4*r12]
	lea rdx, [4*r8]
	mov r14, rcx
	mov qword ptr [rsp], r8
	call qword ptr [rip + memmove@GOTPCREL]
	add r12, qword ptr [rsp]
	lea rdi, [r15 + 4*r12]
	lea rdx, [4*r14]
	mov rsi, r15
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r15
	lea rsi, [r15 + 4*r14]
	jmp .LBB0_40
.LBB0_28:
	lea rsi, [rdi + 4*r12]
	lea rdi, [rdi + 4*rbx]
	jmp .LBB0_36
.LBB0_29:
	lea rsi, [rdi + 4*rdx]
	lea rdi, [rdi + 4*r12]
	jmp .LBB0_40
.LBB0_30:
	lea rax, [r8 + rbx]
	mov r14, rdi
	lea rdi, [rdi + 4*rax]
	mov rsi, r14
	mov r15, r8
.LBB0_31:
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r14
	lea rsi, [r14 + 4*r12]
	lea rdi, [r14 + 4*rbx]
	jmp .LBB0_36
.LBB0_32:
	lea rax, [r12 + r8]
	mov rbx, rdi
	lea rdi, [rdi + 4*rax]
	mov rsi, rbx
	mov r14, r8
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, rbx
	mov rax, qword ptr [rsp]
	lea rsi, [rbx + 4*rax]
	jmp .LBB0_38
.LBB0_33:
	lea rsi, [rdi + 4*r12]
	mov r15, rdi
	lea rdi, [rdi + 4*rbx]
	lea rdx, [4*rcx]
	mov rbx, rcx
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r15
	add rbx, r12
	lea rsi, [r15 + 4*rbx]
	mov r15, r14
	jmp .LBB0_36
.LBB0_34:
	lea rsi, [rdi + 4*rdx]
	mov rbx, rdi
	lea rdi, [rdi + 4*r12]
	mov r15, rdx
	lea rdx, [4*rcx]
	mov r12, rcx
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, rbx
	add r12, r15
	lea rsi, [rbx + 4*r12]
	jmp .LBB0_39
.LBB0_35:
	lea rsi, [rdi + 4*r12]
	mov r14, rdi
	lea rdi, [rdi + 4*rbx]
	mov r12, rdx
	lea rdx, [4*r8]
	mov r15, r8
	call qword ptr [rip + memmove@GOTPCREL]
	add r15, rbx
	mov rsi, r14
	lea rdi, [r14 + 4*r15]
	mov r15, r12
.LBB0_36:
	shl r15, 2
	mov rdx, r15
	call qword ptr [rip + memmove@GOTPCREL]
	mov rsi, qword ptr [rsp + 16]
	jmp .LBB0_42
.LBB0_37:
	lea rsi, [rdi + 4*rdx]
	mov rbx, rdi
	lea rdi, [rdi + 4*r12]
	lea rdx, [4*r8]
	mov r15, r8
	call qword ptr [rip + memmove@GOTPCREL]
	add r12, r15
	mov rsi, rbx
.LBB0_38:
	lea rdi, [rbx + 4*r12]
.LBB0_39:
	mov rbx, r14
.LBB0_40:
	shl rbx, 2
	mov rdx, rbx
	call qword ptr [rip + memmove@GOTPCREL]
	mov r15, qword ptr [r13]
	mov rax, qword ptr [r13 + 16]
	add rax, rbp
	mov rsi, qword ptr [rsp + 16]
.LBB0_41:
	xor ecx, ecx
	cmp rax, r15
	cmovae rcx, r15
	sub rax, rcx
	mov qword ptr [r13 + 16], rax
.LBB0_42:
	sub rsi, rbp
.LBB0_43:
	mov qword ptr [r13 + 24], rsi
.LBB0_44:
	add rsp, 24
	pop rbx
	pop r12
	pop r13
	pop r14
	pop r15
	pop rbp
	ret
.LBB0_45:
	lea rdx, [rip + .L__unnamed_1]
	mov rdi, rax
	call qword ptr [rip + core::slice::index::slice_index_order_fail@GOTPCREL]
```

after:
```asm
clear:
	movups xmmword ptr [rdi + 16], xmm0
	ret

truncate:
	cmp qword ptr [rdi + 24], rsi
	jbe .LBB2_4
	test rsi, rsi
	jne .LBB2_3
	mov qword ptr [rdi + 16], 0
.LBB2_3:
	mov qword ptr [rdi + 24], rsi
.LBB2_4:
	ret

advance:
	mov rcx, qword ptr [rdi + 24]
	mov rax, rcx
	sub rax, rsi
	jbe .LBB3_1
	mov rcx, qword ptr [rdi]
	add rsi, qword ptr [rdi + 16]
	xor edx, edx
	cmp rsi, rcx
	cmovae rdx, rcx
	sub rsi, rdx
	mov qword ptr [rdi + 16], rsi
	mov qword ptr [rdi + 24], rax
	ret
.LBB3_1:
	test rcx, rcx
	je .LBB3_3
	mov qword ptr [rdi + 24], 0
.LBB3_3:
	mov qword ptr [rdi + 16], 0
	ret

remove:
	push rbp
	push r15
	push r14
	push r13
	push r12
	push rbx
	push rax
	mov r15, rsi
	mov r14, rdx
	sub r14, rsi
	jb .LBB4_9
	mov rbx, rdi
	mov r12, qword ptr [rdi + 24]
	mov r13, r12
	sub r13, rdx
	jb .LBB4_10
	mov qword ptr [rbx + 24], r15
	mov rbp, r12
	sub rbp, r14
	test r15, r15
	je .LBB4_4
	cmp rbp, r15
	jne .LBB4_11
.LBB4_4:
	cmp r12, r14
	jne .LBB4_6
.LBB4_5:
	mov qword ptr [rbx + 16], 0
	jmp .LBB4_8
.LBB4_11:
	mov rdi, rbx
	mov rsi, r14
	mov rdx, r15
	mov rcx, r13
	call <<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<T,A> as core::ops::drop::Drop>::drop::copy_data
	cmp r12, r14
	je .LBB4_5
.LBB4_6:
	cmp r13, r15
	jbe .LBB4_8
	mov rax, qword ptr [rbx]
	add r14, qword ptr [rbx + 16]
	xor ecx, ecx
	cmp r14, rax
	cmovae rcx, rax
	sub r14, rcx
	mov qword ptr [rbx + 16], r14
.LBB4_8:
	mov qword ptr [rbx + 24], rbp
	add rsp, 8
	pop rbx
	pop r12
	pop r13
	pop r14
	pop r15
	pop rbp
	ret
.LBB4_9:
	lea rax, [rip + .L__unnamed_1]
	mov rdi, r15
	mov rsi, rdx
	mov rdx, rax
	call qword ptr [rip + core::slice::index::slice_index_order_fail@GOTPCREL]
.LBB4_10:
	lea rax, [rip + .L__unnamed_1]
	mov rdi, rdx
	mov rsi, r12
	mov rdx, rax
	call qword ptr [rip + core::slice::index::slice_end_index_len_fail@GOTPCREL]

<<alloc::collections::vec_deque::drain::Drain<T,A> as core::ops::drop::Drop>::drop::DropGuard<T,A> as core::ops::drop::Drop>::drop::copy_data:
	push rbp
	push r15
	push r14
	push r13
	push r12
	push rbx
	push rax
	mov r14, rsi
	cmp rdx, rcx
	jae .LBB0_1
	mov r12, qword ptr [rdi]
	mov rax, qword ptr [rdi + 16]
	add r14, rax
	xor ecx, ecx
	cmp r14, r12
	cmovae rcx, r12
	sub r14, rcx
	mov r15, rdx
	mov r13, r14
	mov r14, rax
	mov rcx, r13
	sub rcx, r14
	je .LBB0_18
.LBB0_4:
	mov rdi, qword ptr [rdi + 8]
	mov rax, rcx
	add rax, r12
	cmovae rax, rcx
	mov rbx, r12
	sub rbx, r14
	mov rcx, r12
	sub rcx, r13
	mov rbp, r15
	sub rbp, rbx
	jbe .LBB0_5
	cmp rax, r15
	jae .LBB0_12
	mov rdx, r15
	sub rdx, rbx
	shl rdx, 2
	cmp r15, rcx
	jbe .LBB0_16
	sub rbx, rcx
	mov rbp, rdi
	lea rdi, [rdi + 4*rbx]
	mov r15, qword ptr [rip + memmove@GOTPCREL]
	mov rsi, rbp
	mov qword ptr [rsp], rcx
	call r15
	sub r12, rbx
	lea rsi, [4*r12]
	add rsi, rbp
	shl rbx, 2
	mov rdi, rbp
	mov rdx, rbx
	call r15
	mov rdi, rbp
	lea rsi, [4*r14]
	add rsi, rbp
	lea rdi, [4*r13]
	add rdi, rbp
	mov r15, qword ptr [rsp]
	jmp .LBB0_7
.LBB0_1:
	mov r15, rcx
	add r14, rdx
	mov r12, qword ptr [rdi]
	mov r13, qword ptr [rdi + 16]
	add r14, r13
	xor eax, eax
	cmp r14, r12
	mov rcx, r12
	cmovb rcx, rax
	sub r14, rcx
	add r13, rdx
	cmp r13, r12
	cmovae rax, r12
	sub r13, rax
	mov rcx, r13
	sub rcx, r14
	jne .LBB0_4
.LBB0_18:
	add rsp, 8
	pop rbx
	pop r12
	pop r13
	pop r14
	pop r15
	pop rbp
	ret
.LBB0_5:
	mov rbx, r15
	sub rbx, rcx
	jbe .LBB0_6
	cmp rax, r15
	jae .LBB0_9
	lea rax, [rcx + r14]
	sub r15, rcx
	lea rsi, [rdi + 4*rax]
	shl r15, 2
	mov rbx, rdi
	mov rdx, r15
	mov r15, rcx
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, rbx
	lea rsi, [rbx + 4*r14]
	lea rdi, [rbx + 4*r13]
	jmp .LBB0_7
.LBB0_12:
	sub r15, rcx
	jbe .LBB0_13
	sub rcx, rbx
	lea rsi, [rdi + 4*r14]
	mov r12, rdi
	lea rdi, [rdi + 4*r13]
	lea rdx, [4*rbx]
	mov r14, qword ptr [rip + memmove@GOTPCREL]
	mov rbp, rcx
	call r14
	add rbx, r13
	lea rdi, [r12 + 4*rbx]
	lea rdx, [4*rbp]
	mov rsi, r12
	call r14
	mov rdi, r12
	lea rsi, [r12 + 4*rbp]
	jmp .LBB0_7
.LBB0_6:
	lea rsi, [rdi + 4*r14]
	lea rdi, [rdi + 4*r13]
	jmp .LBB0_7
.LBB0_16:
	lea rax, [rbx + r13]
	mov r15, rdi
	lea rdi, [rdi + 4*rax]
	mov rsi, r15
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r15
	lea rsi, [r15 + 4*r14]
	lea rdi, [r15 + 4*r13]
	mov r15, rbx
	jmp .LBB0_7
.LBB0_9:
	lea rsi, [rdi + 4*r14]
	mov r15, rdi
	lea rdi, [rdi + 4*r13]
	lea rdx, [4*rcx]
	mov r12, rcx
	call qword ptr [rip + memmove@GOTPCREL]
	mov rdi, r15
	add r12, r14
	lea rsi, [r15 + 4*r12]
	mov r15, rbx
	jmp .LBB0_7
.LBB0_13:
	lea rsi, [rdi + 4*r14]
	mov r14, rdi
	lea rdi, [rdi + 4*r13]
	lea rdx, [4*rbx]
	call qword ptr [rip + memmove@GOTPCREL]
	add rbx, r13
	mov rsi, r14
	lea rdi, [r14 + 4*rbx]
	mov r15, rbp
.LBB0_7:
	shl r15, 2
	mov rdx, r15
	add rsp, 8
	pop rbx
	pop r12
	pop r13
	pop r14
	pop r15
	pop rbp
	jmp qword ptr [rip + memmove@GOTPCREL]
```

</details>
2024-02-18 00:03:39 +00:00
Ben Kimock
7c2db703b0 Don't use mem::zeroed in vec::IntoIter 2024-02-16 10:44:39 -05:00
Lukas Markeffsky
8f259ade66 add codegen test 2024-02-16 13:11:05 +01:00
bors
dfa88b328f Auto merge of #120500 - oli-obk:intrinsics2.0, r=WaffleLapkin
Implement intrinsics with fallback bodies

fixes #93145 (though we can port many more intrinsics)
cc #63585

The way this works is that the backend logic for generating custom code for intrinsics has been made fallible. The only failure path is "this intrinsic is unknown". The `Instance` (that was `InstanceDef::Intrinsic`) then gets converted to `InstanceDef::Item`, which represents the fallback body. A regular function call to that body is then codegenned. This is currently implemented for

* codegen_ssa (so llvm and gcc)
* codegen_cranelift

other backends will need to adjust, but they can just keep doing what they were doing if they prefer (though adding new intrinsics to the compiler will then require them to implement them, instead of getting the fallback body).

cc `@scottmcm` `@WaffleLapkin`

### todo

* [ ] miri support
* [x] default intrinsic name to name of function instead of requiring it to be specified in attribute
* [x] make sure that the bodies are always available (must be collected for metadata)
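For a rough idea of the shape this enables, here is a simplified version of how a fallback-bodied intrinsic like libcore's `is_val_statically_known` reads (user crates cannot declare intrinsics themselves, and the attribute spelling may differ from this PR's original form):

```rust
#[rustc_intrinsic]
pub const fn is_val_statically_known<T: Copy>(_arg: T) -> bool {
    // Fallback body: a backend without a custom lowering codegens this
    // as an ordinary function, conservatively answering "not known".
    false
}
```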
2024-02-16 09:53:01 +00:00
Augie Fackler
a6ee72df91 tests: LLVM 18 infers an extra noalias here
This test started failing on LLVM 18 after change
61118ffd04. As far as I can tell, it's
just good fortune that LLVM is able to sniff out the new noalias here,
and it's correct.
2024-02-13 10:33:40 +01:00
Oli Scherer
f35a2bd401 Support safe intrinsics with fallback bodies
Turn `is_val_statically_known` into such an intrinsic to demonstrate. It is perfectly safe to call after all.
2024-02-12 17:55:36 +00:00
Matthias Krüger
1843dfd0d5
Rollup merge of #118307 - scottmcm:tuple-eq-simpler, r=joshtriplett
Remove an unneeded helper from the tuple library code

Thanks to https://github.com/rust-lang/rust/pull/107022, this is just what `==` does, so we don't need the helper here anymore.
2024-02-11 08:25:41 +01:00
Michael Goulet
34ed554d81 Build DebugInfo for coroutine-closure 2024-02-09 16:01:29 +00:00
Guillaume Boisseau
7954c28cf9
Rollup merge of #119162 - heiher:direct-access-external-data, r=petrochenkov
Add unstable `-Z direct-access-external-data` cmdline flag for `rustc`

The new flag has been described in the Major Change Proposal at https://github.com/rust-lang/compiler-team/issues/707

Fixes #118053
2024-02-07 18:24:41 +01:00
Matthias Krüger
59ba8024af
Rollup merge of #120502 - clubby789:remove-ffi-returns-twice, r=compiler-errors
Remove `ffi_returns_twice` feature

The [tracking issue](https://github.com/rust-lang/rust/issues/58314) and [RFC](https://github.com/rust-lang/rfcs/pull/2633) have been closed for a couple of years.

There is also an attribute gate in R-A which should be removed if this lands.
2024-02-06 22:45:42 +01:00
bors
268dbbbc4b Auto merge of #120624 - matthiaskrgr:rollup-3gvcl20, r=matthiaskrgr
Rollup of 8 pull requests

Successful merges:

 - #120484 (Avoid ICE when is_val_statically_known is not of a supported type)
 - #120516 (pattern_analysis: cleanup manual impls)
 - #120517 (never patterns: It is correct to lower `!` to `_`.)
 - #120523 (Improve `io::Read::read_buf_exact` error case)
 - #120528 (Store SHOULD_CAPTURE as AtomicU8)
 - #120529 (Update data layouts in custom target tests for LLVM 18)
 - #120531 (Remove a bunch of `has_errors` checks that have no meaningful or the wrong effect)
 - #120533 (Correct paths for hexagon-unknown-none-elf platform doc)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-02-04 20:51:28 +00:00
Matthias Krüger
6f24836a5b
Rollup merge of #120484 - Teapot4195:issue-120480-fix, r=compiler-errors
Avoid ICE when is_val_statically_known is not of a supported type

2 ICE with 1 stone!
1. Implement `llvm.is.constant.ptr` to avoid first ICE in linked issue.
2. return `false` when the argument is not one of `i*`/`f*`/`ptr` to avoid second ICE.

fixes #120480
2024-02-03 22:25:14 +01:00
Oli Scherer
6ac035df44 Revert unsound libcore changes of #119911 2024-02-01 22:53:25 +00:00
clubby789
7331315898 Remove ffi_returns_twice feature 2024-01-30 22:09:09 +00:00
Alex Huang
a97ff2a750 Add additional test cases for is_val_statically_known 2024-01-30 14:37:59 -05:00
Guillaume Gomez
6a1d34f32a
Rollup merge of #120310 - krasimirgg:jan-v0-sym, r=Mark-Simulacrum
adapt test for v0 symbol mangling

No functional changes intended.

Adapts the test to also work under `new-symbol-mangling = true`.
2024-01-30 16:57:48 +01:00
Nikita Popov
bdf7404b43 Update codegen test for LLVM 18 2024-01-26 15:03:23 +01:00
bors
039d887928 Auto merge of #119911 - NCGThompson:is-statically-known, r=oli-obk
Replacement of #114390: Add new intrinsic `is_val_statically_known` and optimize pow for powers of two

This adds a new intrinsic `is_val_statically_known` that lowers to [`@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic). It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See #114390 for more discussion.

While I have extended the scope of the power of two optimization from #114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.
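The optimized idiom, sketched (any compile-time-constant power-of-two base qualifies):

```rust
// Because the base is statically known to be a power of two, pow can
// be lowered to a shift: 4096^order == 1 << (12 * order).
fn page_bytes(order: u32) -> usize {
    4096usize.pow(order)
}
```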

Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).

Fixes #47234
Resolves #114390
`@Centri3`
2024-01-25 05:16:53 +00:00
Krasimir Georgiev
e23937c6d3 adapt test for v0 symbol mangling
No functional changes intended.

Adapts the test to also work under new-symbol-mangling = true.
2024-01-24 14:57:21 +00:00
Nicholas Thompson
9dccd5dce1 Further Implement Power of Two Optimization 2024-01-23 12:03:50 -05:00
Nicholas Thompson
971e37ff7e Further Implement is_val_statically_known 2024-01-23 12:02:31 -05:00
Nikita Popov
31f5f033e9 Remove uses of no-system-llvm
It looks like none of these are actually needed.
2024-01-23 10:31:07 +01:00
Nikita Popov
823e8b041a Allow disjoint flag in codegen test 2024-01-23 10:12:36 +01:00
bors
e35a56d96f Auto merge of #119892 - joboet:libs_use_assert_unchecked, r=Nilstrieb,cuviper
Use `assert_unchecked` instead of `assume` intrinsic in the standard library

Now that a public wrapper for the `assume` intrinsic exists, we can use it in the standard library.

CC #119131
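A sketch of the replacement pattern (`assert_unchecked` has since been stabilized in `std::hint`; it was unstable at the time):

```rust
use std::hint::assert_unchecked;

/// # Safety
/// Caller must guarantee `v` is non-empty.
unsafe fn first_unchecked(v: &[u8]) -> u8 {
    // Replaces a raw `intrinsics::assume(!v.is_empty())` call.
    unsafe { assert_unchecked(!v.is_empty()) };
    v[0] // the bounds check folds away under the assumption
}
```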
2024-01-23 06:45:58 +00:00
joboet
638439a440
update codegen tests 2024-01-22 15:46:32 +01:00
AngelicosPhosphoros
60208a0517 Tweak the threshold for chunked swapping
Thanks to #98892 for the tests I brought in here, as they demonstrated that 3×usize is currently suboptimal.
2024-01-19 23:00:34 -08:00
Catherine Flores
5a4561749a Add new intrinsic is_constant and optimize pow

Squashed fixup commits:
- Fix overflow check
- Make MIRI choose the path randomly and rename the intrinsic
- Add back test
- Add miri test and make it operate on `ptr`
- Define `llvm.is.constant` for primitives
- Update MIRI comment and fix test in stage2
- Add const eval test
- Clarify that both branches must have the same side effects
- guaranteed non guarantee
- use immediate type instead

Co-Authored-By: Ralf Jung <post@ralfj.de>
2024-01-19 13:46:27 -05:00
Nikita Popov
ce2d91dccd Directly use volatile_load intrinsic
This makes the test work if libstd is compiled with debug assertions.
2024-01-19 10:52:01 +01:00
Nikita Popov
7a0415ce37 Add codegen test for ScalarPair with i128 on LLVM 17 2024-01-19 10:52:01 +01:00
bors
bf2637f4e8 Auto merge of #119954 - scottmcm:option-unwrap-failed, r=WaffleLapkin
Split out `option::unwrap_failed` like we have `result::unwrap_failed`

...and like `option::expect_failed`
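A minimal sketch of the split (names mirror the real helpers; the bodies are illustrative):

```rust
// The panic machinery lives in one shared, outlined #[cold] function...
#[cold]
#[inline(never)]
fn unwrap_failed() -> ! {
    panic!("called `Option::unwrap()` on a `None` value")
}

// ...so the happy path stays tiny and trivially inlinable.
fn my_unwrap<T>(opt: Option<T>) -> T {
    match opt {
        Some(val) => val,
        None => unwrap_failed(),
    }
}
```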
2024-01-16 15:32:39 +00:00
WANG Rui
06a41687b1 Add unstable -Z direct-access-external-data cmdline flag for rustc
The new flag has been described in the Major Change Proposal at
https://github.com/rust-lang/compiler-team/issues/707
2024-01-16 19:15:06 +08:00
bors
1ead4761e9 Auto merge of #119878 - scottmcm:inline-always-unwrap, r=workingjubilee
Tune the inlinability of `unwrap`

Fixes #115463
cc `@thomcc`

This tweaks `unwrap` on ~~`Option` &~~ `Result` to be two parts:
- `#[inline(always)]` for checking the discriminant
- `#[cold]` for actually panicking

The idea here is that checking the discriminant on a `Result` ~~or `Option`~~ should always be trivial enough to be worth inlining, even in `opt-level=z`, especially compared to passing it to a function.

As seen in the issue and codegen test, this will hopefully help particularly for things like `.try_into().unwrap()`s that are actually infallible, but in a way that's only visible with the inlining.

EDIT: I've restricted this to `Result` to avoid combining effects
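The motivating pattern, sketched:

```rust
// The conversion can never fail, but that only becomes visible to the
// optimizer once unwrap's discriminant check is inlined; ideally this
// compiles down to a plain truncation.
fn low_byte(x: u32) -> u8 {
    (x & 0xFF).try_into().unwrap()
}
```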
2024-01-15 09:20:46 +00:00
Scott McMurray
23483664a2 Split out option::unwrap_failed like we have result::unwrap_failed
...and like `option::expect_failed`
2024-01-14 12:45:01 -08:00
bors
2319be8e26 Auto merge of #119452 - AngelicosPhosphoros:make_nonzeroint_get_assume_nonzero, r=scottmcm
Add assume into `NonZeroIntX::get`

LLVM currently doesn't support range metadata for function arguments, so it fails to optimize non-zero integers using their invariant when they are passed as by-value function arguments.

Related to https://github.com/rust-lang/rust/issues/119422
Related to https://github.com/llvm/llvm-project/issues/76628
Related to https://github.com/rust-lang/rust/issues/49572
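A sketch of the effect (illustrative function):

```rust
use std::num::NonZeroU32;

// Even though `d` arrives by value (where LLVM attaches no range
// metadata), the assume in get() lets the divide-by-zero panic
// path be eliminated.
fn div(x: u32, d: NonZeroU32) -> u32 {
    x / d.get()
}
```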
2024-01-12 20:18:04 +00:00
Scott McMurray
b858c591dd Tune the inlinability of Result::unwrap 2024-01-12 10:57:58 -08:00
The 8472
93b34a5ffa mark vec::IntoIter pointers as !nonnull 2024-01-07 03:44:04 +01:00
AngelicosPhosphoros
8f432d4ae6 Add assume into NonZeroIntX::get
LLVM currently doesn't support range metadata for function arguments, so it fails to optimize non-zero integers using their invariant when they are passed as by-value function arguments.

Related to https://github.com/rust-lang/rust/issues/119422
Related to https://github.com/llvm/llvm-project/issues/76628
Related to https://github.com/rust-lang/rust/issues/49572
2024-01-06 14:26:37 +01:00
bors
432fffa8af Auto merge of #118991 - nikic:scalar-pair, r=nagisa
Separate immediate and in-memory ScalarPair representation

Currently, we assume that ScalarPair is always represented using a two-element struct, both as an immediate value and when stored in memory.

This currently works fairly well, but runs into problems with https://github.com/rust-lang/rust/pull/116672, where a ScalarPair involving an i128 type can no longer be represented as a two-element struct in memory. For example, the tuple `(i32, i128)` needs to be represented in-memory as `{ i32, [3 x i32], i128 }` to satisfy alignment requirements. Using `{ i32, i128 }` instead will result in the second element being stored at the wrong offset (prior to LLVM 18).

Resolve this issue by no longer requiring that the immediate and in-memory type for ScalarPair are the same. The in-memory type will now look the same as for normal struct types (and will include padding filler and similar), while the immediate type stays a simple two-element struct type. This also means that booleans in immediate ScalarPair are now represented as i1 rather than i8, just like we do everywhere else.

The core change here is to llvm_type (which now treats ScalarPair as a normal struct) and immediate_llvm_type (which returns the two-element struct that llvm_type used to produce). The rest is fixing things up to no longer assume these are the same. In particular, this switches places that try to get pointers to the ScalarPair elements to use byte-geps instead of struct-geps.
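The problem case from the description, as a minimal function:

```rust
// As an immediate, the pair is a simple two-element aggregate; stored
// in memory it needs padding for i128's 16-byte alignment, i.e.
// { i32, [3 x i32], i128 } in LLVM terms.
pub fn roundtrip(p: (i32, i128)) -> (i32, i128) {
    p
}
```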
2024-01-05 14:31:56 +00:00
Nikita Popov
3cd6cde0be Make test compatible with 32-bit as well 2024-01-05 11:45:57 +01:00
Matthias Krüger
c505d760a6
Rollup merge of #119555 - Kobzol:maybeuninit-rvo-codegen-test, r=nikic
Add codegen test for RVO on MaybeUninit

Codegen test for https://github.com/rust-lang/rust/issues/90595. Currently, this only works with `-Cpanic=abort`, but hopefully in the [future](https://www.npopov.com/2024/01/01/This-year-in-LLVM-2023.html#writable-and-dead_on_unwind) it should also work in the presence of panics.

r? `@nikic`
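A sketch of the pattern being tested (buffer size and initialization are illustrative):

```rust
use std::mem::MaybeUninit;

// With -Cpanic=abort, the large return value can be built directly in
// the caller-provided return slot, with no intermediate memcpy.
pub fn make_buf() -> MaybeUninit<[u8; 4096]> {
    let mut buf = MaybeUninit::<[u8; 4096]>::uninit();
    unsafe { (buf.as_mut_ptr() as *mut u8).write(0) }; // init in place
    buf
}
```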
2024-01-04 08:33:26 +01:00
Jakub Beránek
0c56ccff04
Add codegen test for RVO on MaybeUninit
Currently, this only works with `-Cpanic=abort`.
2024-01-03 21:18:07 +01:00
León Orell Valerian Liehr
fcec407f4a
Rollup merge of #119523 - maurer:fix-sparc-llvm-18, r=nikic
llvm: Allow `noundef` in codegen tests

LLVM 18 will automatically infer `noundef` in some situations. Adjust codegen tests to accept this.

See llvm/llvm-project#76553 for why `noundef` is being generated now.

`@rustbot` label: +llvm-main
2024-01-03 16:08:32 +01:00
Matthew Maurer
ee86b1f84c llvm: Allow noundef in codegen tests
LLVM 18 will automatically infer `noundef` in some situations.
Adjust codegen tests to accept this.

See llvm/llvm-project#76553 for why `noundef` is being generated now.
2024-01-02 18:02:17 +00:00
Nikita Popov
8e64fc94d8 Address review comments 2024-01-02 15:03:14 +01:00
Camille GILLOT
6dfda0d32f Revert codegen test change. 2023-12-24 20:08:58 +00:00
Camille GILLOT
2837727471 Replace legacy ConstProp by GVN. 2023-12-24 20:08:57 +00:00
Camille GILLOT
a03c972816 Enable GVN by default. 2023-12-24 20:08:57 +00:00
Augie Fackler
58fdbd1479 tests: fix overaligned-constant to not over-specify getelementptr instr
On LLVM 18 we get slightly different arguments here, so it's easier to
just regex those away. The important details are all still asserted as I
understand things.

Fixes #119193.

@rustbot label: +llvm-main
2023-12-21 15:53:28 -05:00
bors
920e0051cf Auto merge of #119056 - cjgillot:codegen-overalign, r=wesleywiser
Tolerate overaligned MIR constants for codegen.

Fixes https://github.com/rust-lang/rust/issues/117761

cc `@saethlin`
2023-12-21 04:01:36 +00:00
bors
51c0db6a91 Auto merge of #106790 - the8472:rawvec-niche, r=scottmcm
add more niches to rawvec

Previously RawVec only had a single niche in its `NonNull` pointer. With this change it now has `isize::MAX` niches since half the value-space of the capacity field is never needed, we can't have a capacity larger than isize::MAX.
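One observable effect, sketched (exact layouts are not guaranteed by the language):

```rust
// With niches in the capacity field, nested enums around Vec can fit
// without any extra tag bytes.
fn main() {
    println!("{}", std::mem::size_of::<Vec<u8>>());
    println!("{}", std::mem::size_of::<Option<Option<Vec<u8>>>>());
}
```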
2023-12-20 02:19:10 +00:00
Camille GILLOT
503af0deb2 Fortify test. 2023-12-17 23:31:58 +00:00
Camille GILLOT
3ea5cfaa11 Tolerate overaligned MIR constants for codegen. 2023-12-17 22:56:42 +00:00
Nikita Popov
c2fd26a115 Separate immediate and in-memory ScalarPair representation
Currently, we assume that ScalarPair is always represented using
a two-element struct, both as an immediate value and when stored
in memory.

This currently works fairly well, but runs into problems with
https://github.com/rust-lang/rust/pull/116672, where a ScalarPair
involving an i128 type can no longer be represented as a two-element
struct in memory. For example, the tuple `(i32, i128)` needs to be
represented in-memory as `{ i32, [3 x i32], i128 }` to satisfy
alignment requirements. Using `{ i32, i128 }` instead will result in
the second element being stored at the wrong offset (prior to
LLVM 18).

Resolve this issue by no longer requiring that the immediate and
in-memory type for ScalarPair are the same. The in-memory type
will now look the same as for normal struct types (and will include
padding filler and similar), while the immediate type stays a
simple two-element struct type. This also means that booleans in
immediate ScalarPair are now represented as i1 rather than i8,
just like we do everywhere else.

The core change here is to llvm_type (which now treats ScalarPair
as a normal struct) and immediate_llvm_type (which returns the
two-element struct that llvm_type used to produce). The rest is
fixing things up to no longer assume these are the same. In
particular, this switches places that try to get pointers to the
ScalarPair elements to use byte-geps instead of struct-geps.
2023-12-15 17:42:05 +01:00
Wesley Wiser
ce290514df
Adapt debug-accessibility tests for msvc-style enums 2023-12-15 11:45:03 +00:00
David Wood
07931c5a08
codegen_llvm: set DW_AT_accessibility
Sets the accessibility of types and fields in DWARF using
`DW_AT_accessibility` attribute.

`DW_AT_accessibility` (public/protected/private) isn't exactly right for
Rust,  but neither is `DW_AT_visibility` (local/exported/qualified), and
there's no way to set `DW_AT_visbility` in LLVM's API.

Signed-off-by: David Wood <david@davidtw.co>
2023-12-15 11:36:41 +00:00
bors
9d49eb76c4 Auto merge of #118417 - anforowicz:default-hidden-visibility, r=TaKO8Ki
Add unstable `-Zdefault-hidden-visibility` cmdline flag for `rustc`.

The new flag has been described in the Major Change Proposal at
https://github.com/rust-lang/compiler-team/issues/656
2023-12-14 09:16:15 +00:00
bors
e6d1b0ec98 Auto merge of #118491 - cuviper:aarch64-stack-probes, r=wesleywiser
Enable stack probes on aarch64 for LLVM 18

I tested this on `aarch64-unknown-linux-gnu` with LLVM main (~18).

cc #77071, to be closed once we upgrade our LLVM submodule.
2023-12-14 02:01:13 +00:00
Lukasz Anforowicz
981c4e3ce6 Add unstable -Zdefault-hidden-visibility cmdline flag for rustc.
The new flag has been described in the Major Change Proposal at
https://github.com/rust-lang/compiler-team/issues/656
2023-12-13 21:14:23 +00:00
Jakub Okoński
95b5a80f47
Fix alignment passed down to LLVM for simd_masked_load 2023-12-12 13:11:59 +01:00
The 8472
502df1b7d4 add more niches to rawvec 2023-12-11 23:38:48 +01:00
Jakub Okoński
97ae5095f5
Add simd_masked_{load,store} platform-intrinsics
This maps to the LLVM intrinsics: llvm.masked.load and llvm.masked.store
2023-12-09 12:36:08 +01:00
Josh Stone
b99b5e5752 Enable stack probes on aarch64 for LLVM 18 2023-12-07 17:17:00 -08:00
Ramon de C Valle
97032d63bd CFI: Add char to CFI integer normalization
Adds char to CFI integer normalization to conform to #118032 for
cross-language CFI support.
2023-12-07 11:28:16 -08:00
bendn
73afc00cf9
use assume(idx < self.len()) in [T]::get_unchecked 2023-12-04 06:00:12 +07:00
bors
3f1e30a0a5 Auto merge of #118077 - calebzulawski:sync-portable-simd-2023-11-19, r=workingjubilee
Portable SIMD subtree update

Syncs nightly to the latest changes from rust-lang/portable-simd

r? `@rust-lang/libs`
2023-12-02 18:04:01 +00:00
bors
f45631b10f Auto merge of #116892 - ojeda:rethunk, r=wesleywiser
Add `-Zfunction-return={keep,thunk-extern}` option

This is intended to be used for Linux kernel RETHUNK builds.

With this commit (optionally backported to Rust 1.73.0), plus a patched Linux kernel to pass the flag, I get a RETHUNK build with Rust enabled that is `objtool`-warning-free and is able to boot in QEMU and load a sample Rust kernel module.

Issue: https://github.com/rust-lang/rust/issues/116853.
2023-11-30 22:10:30 +00:00
Miguel Ojeda
2d476222e8 Add -Zfunction-return={keep,thunk-extern} option
This is intended to be used for Linux kernel RETHUNK builds.

With this commit (optionally backported to Rust 1.73.0), plus a
patched Linux kernel to pass the flag, I get a RETHUNK build with
Rust enabled that is `objtool`-warning-free and is able to boot in
QEMU and load a sample Rust kernel module.

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-11-30 20:21:31 +01:00
bors
07921b50ba Auto merge of #118036 - DianQK:thinlto-tests, r=tmiasko
Add thinlto support to codegen, assembly and coverage tests

Using `--emit=llvm-ir` with ThinLTO usually results in multiple IR files.
Resolves the test-case failures reported in #113923.
2023-11-30 13:33:32 +00:00
DianQK
c41bf96039
Add thinlto support to codegen, assembly and coverage tests 2023-11-30 18:48:03 +08:00
Krasimir Georgiev
81cd7c5b11 update test for new LLVM 18 codegen
LLVM at HEAD now emits `or disjoint`: https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/24076#018c1596-8153-488e-b622-951266a02f6c/741-774
2023-11-28 12:10:59 +00:00
bors
49b3924bd4 Auto merge of #117947 - Dirbaio:drop-llvm-15, r=cuviper
Update the minimum external LLVM to 16.

With this change, we'll have stable support for LLVM 16 and 17.
For reference, the previous increase to LLVM 15 was #114148

[Relevant zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/riscv.20forced-atomics)
2023-11-27 21:54:03 +00:00
Caleb Zulawski
4d9607869a Update std::simd usage and test outputs 2023-11-26 09:02:25 -05:00
Scott McMurray
4b3f11523d Remove an unneeded helper from the tuple library code 2023-11-25 22:25:00 -08:00
Arlie Davis
9429d68842 convert ehcont-guard to an unstable option 2023-11-21 14:24:23 -08:00
Arlie Davis
e11d8d147b Add support for generating the EHCont section
In the future Windows will enable Control-flow Enforcement Technology
(CET aka Shadow Stacks). To protect the path where the context is
updated during exception handling, the binary is required to enumerate
valid unwind entrypoints in a dedicated section which is validated when
the context is being set during exception handling.

The required support for EHCONT was merged into LLVM long ago. This
change adds the Rust codegen option to enable it.

Reference:

* https://reviews.llvm.org/D40223

This also adds a new `ehcont-guard` option to the bootstrap config which
enables EHCont Guard when building std.
2023-11-21 13:41:23 -08:00
Dario Nieuwenhuis
7de6d04bc8 Update the minimum external LLVM to 16. 2023-11-21 22:40:16 +01:00
bors
0b24479638 Auto merge of #116555 - paulmenage:llvm-module-flag, r=wesleywiser
Add -Z llvm_module_flag

Allow adding values to the `!llvm.module.flags` metadata for a generated module.  The syntax is

`-Z llvm_module_flag=<name>:<type>:<value>:<behavior>`

Currently only u32 values are supported, but the type is required to be specified for forward compatibility. The `behavior` element must match one of the named LLVM metadata behaviors.

This flag is expected to be perma-unstable.
2023-11-15 16:54:31 +00:00
Augie Fackler
5d8d700fd3 tests: update check for inferred nneg on zext
This was broken by upstream
llvm/llvm-project@dc6d077396. It's easy
enough to use a regex match to support both, so we do that.

r? @nikic
@rustbot label: +llvm-main
2023-11-13 10:43:33 -05:00
Paul Menage
2e6b57541d Add -Z llvm_module_flag
Allow adding values to the `!llvm.module.flags` metadata for a generated
module.  The syntax is

`-Z llvm_module_flag=<name>:<type>:<value>:<behavior>`

Currently only u32 values are supported, but the type is required to be
specified for forward compatibility. The `behavior` element must match
one of the named LLVM metadata behaviors.

This flag is expected to be perma-unstable.
2023-11-11 19:48:47 -08:00
Ben Kimock
d32d9238cf Emit #[inline] on derive(Debug) 2023-11-09 10:40:55 -05:00
Ben Kimock
fcdd99edca Add -Zcross-crate-inline-threshold=yes 2023-11-07 18:45:11 -05:00
bors
f5ca57e153 Auto merge of #117503 - kornelski:hint-try-reserved, r=workingjubilee
Hint optimizer about try-reserved capacity

This is #116568, but limited only to the less-common `try_reserve` functions to reduce bloat in debug binaries from debug info, while still addressing the main use-case #116570
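The pattern that benefits, sketched:

```rust
use std::collections::TryReserveError;

// After a successful try_reserve, the optimizer is hinted that the
// capacity suffices, so the extend below needs no second growth check.
fn push_all(v: &mut Vec<u8>, data: &[u8]) -> Result<(), TryReserveError> {
    v.try_reserve(data.len())?;
    v.extend_from_slice(data);
    Ok(())
}
```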
2023-11-05 00:03:41 +00:00
Kornel
029fbd67ef Hint optimizer about reserved capacity 2023-11-02 00:52:06 +00:00
Matthias Krüger
260e07b0cb
Rollup merge of #115626 - clarfonthey:unchecked-math, r=thomcc
Clean up unchecked_math, separate out unchecked_shifts

Tracking issue: #85122

Changes:

1. Remove `const_inherent_unchecked_arith` flag and make const-stability flags the same as the method feature flags. Given the number of other unsafe const fns already stabilised, it makes sense to just stabilise these in const context when they're stabilised.
2. Move `unchecked_shl` and `unchecked_shr` into a separate `unchecked_shifts` flag, since the semantics for them are unclear and they'll likely be stabilised separately as a result.
3. Add an `unchecked_neg` method exclusively to signed integers, under the `unchecked_neg` flag. This is because it's a new API and probably needs some time to marinate before it's stabilised, and while it *would* make sense to have a similar version for unsigned integers, since `checked_neg` also exists for those, there is absolutely no case where that would be a good idea, IMQHO.

The longer-term goal here is to prepare the `unchecked_math` methods for an FCP and stabilisation since they've existed for a while, their semantics are clear, and people seem in favour of stabilising them.
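For reference, a sketch of the new signed-only API from item 3 (still nightly-gated):

```rust
#![feature(unchecked_neg)]

// UB if x == i32::MIN; otherwise identical to -x, with no overflow check.
fn negate(x: i32) -> i32 {
    unsafe { x.unchecked_neg() }
}
```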
2023-11-01 11:29:41 +01:00
okaneco
465ffc9ca7 Refactor some char, u8 ascii functions to be branchless
Decompose a single `matches!` with or-patterns into individual `matches!`
expressions to enable branchless code output (see the sketch after the
list below). The following functions were changed:
- `is_ascii_alphanumeric`
- `is_ascii_hexdigit`
- `is_ascii_punctuation`
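A sketch of the decomposition, shown for hex digits (the real methods live on `u8`/`char` in core):

```rust
// Separate range tests combined with bitwise `|` evaluate all three
// conditions without short-circuit branches.
const fn is_ascii_hexdigit(c: u8) -> bool {
    matches!(c, b'0'..=b'9')
        | matches!(c, b'a'..=b'f')
        | matches!(c, b'A'..=b'F')
}
```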

Add codegen tests

Co-authored-by: George Bateman <george.bateman16@gmail.com>
Co-authored-by: scottmcm <scottmcm@users.noreply.github.com>
2023-10-26 21:48:36 -04:00