nordic-dev.net/rust - rust

mirror of https://github.com/rust-lang/rust.git synced 2024-11-26 16:54:01 +00:00

Author	SHA1	Message	Date
Taiki Endo	241f82ad91	Basic inline assembly support for SPARC and SPARC64	2024-11-07 21:19:03 +09:00
bors	e8c698bb3b	Auto merge of #129884 - RalfJung:forbidden-target-features, r=workingjubilee mark some target features as 'forbidden' so they cannot be (un)set with -Ctarget-feature The context for this is https://github.com/rust-lang/rust/issues/116344: some target features change the way floats are passed between functions. Changing those target features is unsound as code compiled for the same target may now use different ABIs. So this introduces a new concept of "forbidden" target features (on top of the existing "stable " and "unstable" categories), and makes it a hard error to (un)set such a target feature. For now, the x86 and ARM feature `soft-float` is on that list. We'll have to make some effort to collect more relevant features, and similar features from other targets, but that can happen after the basic infrastructure for this landed. (These features are being collected in https://github.com/rust-lang/rust/issues/131799.) I've made this a warning for now to give people some time to speak up if this would break something. MCP: https://github.com/rust-lang/compiler-team/issues/780	2024-11-05 16:25:45 +00:00
bors	96477c55bc	Auto merge of #131341 - taiki-e:ppc-clobber-abi, r=bzEq,workingjubilee Support clobber_abi and vector registers (clobber-only) in PowerPC inline assembly This supports `clobber_abi` which is one of the requirements of stabilization mentioned in #93335. This basically does a similar thing I did in https://github.com/rust-lang/rust/pull/130630 to implement `clobber_abi` for s390x, but for powerpc/powerpc64/powerpc64le. - This also supports vector registers (as `vreg`) as clobber-only, which need to support clobbering of them to implement `clobber_abi`. - `vreg` should be able to accept `#[repr(simd)]` types as input/output if the unstable `altivec` target feature is enabled, but `core::arch::{powerpc,powerpc64}` vector types, `#[repr(simd)]`, and `core::simd` are all unstable, so the fact that this is currently a clobber-only should not be considered a blocker of clobber_abi implementation or stabilization. So I have not implemented it in this PR. - See https://github.com/rust-lang/rust/pull/131551 (which is based on this PR) for a PR to implement this. - (I'm not sticking to whether that PR should be a separate PR or part of this PR, so I can merge that PR into this PR if needed.) Refs: - PPC32 SysV: Section "Function Calling Sequence" in [System V Application Binary Interface PowerPC Processor Supplement](https://refspecs.linuxfoundation.org/elf/elfspec_ppc.pdf) - PPC64 ELFv1: Section 3.2 "Function Calling Sequence" in [64-bit PowerPC ELF Application Binary Interface Supplement](https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#FUNC-CALL) - PPC64 ELFv2: Section 2.2 "Function Calling Sequence" in [64-Bit ELF V2 ABI Specification](https://openpowerfoundation.org/specifications/64bitelfabi/) - AIX: [Register usage and conventions](https://www.ibm.com/docs/en/aix/7.3?topic=overview-register-usage-conventions), [Special registers in the PowerPC®](https://www.ibm.com/docs/en/aix/7.3?topic=overview-special-registers-in-powerpc), [AIX vector programming](https://www.ibm.com/docs/en/aix/7.3?topic=concepts-aix-vector-programming) - Register definition in LLVM: https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/PowerPC/PPCRegisterInfo.td#L189 If I understand the above four ABI documentations correctly, except for the PPC32 SysV's VR (Vector Registers) and 32-bit AIX (currently not supported by rustc)'s r13, there does not appear to be important differences in terms of implementing `clobber_abi`: - The above four ABIs are consistent about FPR (0-13: volatile, 14-31: nonvolatile), CR (0-1,5-7: volatile, 2-4: nonvolatile), XER (volatile), and CTR (volatile). - As for GPR, only the registers we are treating as reserved are slightly different - r0, r3-r12 are volatile - r1(sp, reserved), r14-31 are nonvolatile - r2(reserved) is TOC pointer in PPC64 ELF/AIX, system-reserved register in PPC32 SysV (AFAIK used as thread pointer in Linux/BSDs) - r13(reserved for non-32-bit-AIX) is thread pointer in PPC64 ELF, small data area pointer register in PPC32 SysV, "reserved under 64-bit environment; not restored across system calls[^r13]" in AIX) - As for FPSCR, volatile in PPC64 ELFv1/AIX, some fields are volatile only in certain situations (rest are volatile) in PPC32 SysV/PPC64 ELFv2. - As for VR (Vector Registers), it is not mentioned in PPC32 SysV, v0-v19 are volatile in both in PPC64 ELF/AIX, v20-v31 are nonvolatile in PPC64 ELF, reserved or nonvolatile depending on the ABI ([vec-extabi vs vec-default in LLVM](https://reviews.llvm.org/D89684), we are [using vec-extabi](https://github.com/rust-lang/rust/pull/131341#discussion_r1797693299)) in AIX: > When the default Vector enabled mode is used, these registers are reserved and must not be used. > In the extended ABI vector enabled mode, these registers are nonvolatile and their values are preserved across function calls I left [FIXME comment about PPC32 SysV](https://github.com/rust-lang/rust/pull/131341#discussion_r1790496095) and added ABI check for AIX. - As for VRSAVE, it is not mentioned in PPC32 SysV, nonvolatile in PPC64 ELFv1, reserved in PPC64 ELFv2/AIX - As for VSCR, it is not mentioned in PPC32 SysV/PPC64 ELFv1, some fields are volatile only in certain situations (rest are volatile) in PPC64 ELFv2, volatile in AIX We are currently treating r1-r2, r13 (non-32-bit-AIX), r29-r31, LR, CTR, and VRSAVE as reserved. We are currently not processing anything about FPSCR and VSCR, but I feel those are things that should be processed by `preserves_flags` rather than `clobber_abi` if we need to do something about them. (However, PPCRegisterInfo.td in LLVM does not seem to define anything about them.) Replaces #111335 and #124279 cc `@ecnelises` `@bzEq` `@lu-zero` r? `@Amanieu` `@rustbot` label +O-PowerPC +A-inline-assembly [^r13]: callee-saved, according to [LLVM](`6a6af0246b/llvm/lib/Target/PowerPC/PPCCallingConv.td (L322)`) and [GCC](`a9173a50e7/gcc/config/rs6000/rs6000.h (L859)`).	2024-11-05 03:13:47 +00:00
Ralf Jung	ffad9aac27	mark some target features as 'forbidden' so they cannot be (un)set For now, this is just a warning, but should become a hard error in the future	2024-11-04 22:56:47 +01:00
bjorn3	9e6d2da83d	Reduce dependence on the target name The target name can be anything with custom target specs. Matching on fields inside the target spec is much more robust than matching on the target name.	2024-11-03 18:29:01 +00:00
Noratrieb	a26450cf81	Rename target triple to target tuple in many places in the compiler This changes the naming to the new naming, used by `--print target-tuple`. It does not change all locations, but many.	2024-11-02 21:29:59 +01:00
Taiki Endo	d19517dcd0	Support clobber_abi and vector registers (clobber-only) in PowerPC inline assembly	2024-11-02 20:26:08 +09:00
Jubilee Young	0349209901	cg_gcc: `rustc_abi::Abi` => `BackendRepr`	2024-10-29 15:01:01 -07:00
Deadbeef	f6fea83342	Effects cleanup - removed extra bits from predicates queries that are no longer needed in the new system - removed the need for `non_erasable_generics` to take in tcx and DefId, removed unused arguments in callers	2024-10-26 10:19:07 +08:00
Zalathar	8f07514520	coverage: SSA doesn't need to know about `instrprof_increment`	2024-10-25 14:24:05 +11:00
Josh Triplett	ecdc2441b6	"innermost", "outermost", "leftmost", and "rightmost" don't need hyphens These are all standard dictionary words and don't require hyphenation.	2024-10-23 02:45:24 -07:00
Jubilee	fe2cbbd2d5	Rollup merge of #130432 - azhogin:azhogin/regparm, r=workingjubilee,pnkfelix rust_for_linux: -Zregparm=<N> commandline flag for X86 (#116972) Command line flag `-Zregparm=<N>` for X86 (32-bit) for rust-for-linux: https://github.com/rust-lang/rust/issues/116972 Implemented in the similar way as fastcall/vectorcall support (args are marked InReg if fit).	2024-10-21 20:32:00 -07:00
Michael Goulet	5cf8107aa6	Fix tests	2024-10-19 18:07:35 +00:00
Andrew Zhogin	b3ae64d24f	rust_for_linux: -Zregparm=<N> commandline flag for X86 (#116972 )	2024-10-18 00:29:31 +07:00
Jed Brown	0d8a978e8a	intrinsics.fmuladdf{16,32,64,128}: expose llvm.fmuladd.* semantics Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) + c`, to be fused if the code generator determines that (i) the target instruction set has support for a fused operation, and (ii) that the fused operation is more efficient than the equivalent, separate pair of `mul` and `add` instructions. https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic MIRI support is included for f32 and f64. The codegen_cranelift uses the `fma` function from libc, which is a correct implementation, but without the desired performance semantic. I think this requires an update to cranelift to expose a suitable instruction in its IR. I have not tested with codegen_gcc, but it should behave the same way (using `fma` from libc).	2024-10-11 15:32:56 -06:00
Matthias Krüger	13976f1f25	Rollup merge of #130308 - davidtwco:tied-target-consolidation, r=wesleywiser codegen_ssa: consolidate tied target checks Fixes #105110. Fixes #105111. `rustc_codegen_llvm` and `rustc_codegen_gcc` duplicated logic for checking if tied target features were partially enabled. This PR consolidates these checks into `rustc_codegen_ssa` in the `codegen_fn_attrs` query, which also is run pre-monomorphisation for each function, which ensures that this check is run for unused functions, as would be expected. Also adds a test confirming that enabling one tied feature doesn't imply another - the appropriate error for this was already being emitted. I did a bisect and narrowed it down to two patches it was likely to be - something in #128796, probably #128221 or #128679.	2024-10-10 22:00:45 +02:00
Jubilee Young	d92aee556d	cg_gcc: Factor out rustc_target::abi	2024-10-08 18:24:56 -07:00
Urgau	018ba0528f	Use wide pointers consistenly across the compiler	2024-10-04 14:06:48 +02:00
bors	06bb8364aa	Auto merge of #131111 - matthiaskrgr:rollup-n6do187, r=matthiaskrgr Rollup of 4 pull requests Successful merges: - #130005 (Replace -Z default-hidden-visibility with -Z default-visibility) - #130229 (ptr::add/sub: do not claim equivalence with `offset(c as isize)`) - #130773 (Update Unicode escapes in `/library/core/src/char/methods.rs`) - #130933 (rustdoc: lists items that contain multiple paragraphs are more clear) r? `@ghost` `@rustbot` modify labels: rollup	2024-10-01 19:29:26 +00:00
Guillaume Gomez	344b6a1668	Rollup merge of #130630 - taiki-e:s390x-clobber-abi, r=Amanieu Support clobber_abi and vector/access registers (clobber-only) in s390x inline assembly This supports `clobber_abi` which is one of the requirements of stabilization mentioned in #93335. This also supports vector registers (as `vreg`) and access registers (as `areg`) as clobber-only, which need to support clobbering of them to implement clobber_abi. Refs: - "1.2.1.1. Register Preservation Rules" section in ELF Application Binary Interface s390x Supplement, Version 1.6.1 (lzsabi_s390x.pdf in https://github.com/IBM/s390x-abi/releases/tag/v1.6.1) - Register definition in LLVM: - Vector registers https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td#L249 - Access registers https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td#L332 I have three questions: - ~~ELF Application Binary Interface s390x Supplement says that `cc` (condition code, bits 18-19 of PSW) is "Volatile". However, we do not have a register class for `cc` and instead mark `cc` as clobbered unless `preserves_flags` is specified (https://github.com/rust-lang/rust/pull/111331). Therefore, in the current implementation, if both `preserves_flags` and `clobber_abi` are specified, `cc` is not marked as clobbered. Is this okay? Or even if `preserves_flags` is used, should `cc` be marked as clobbered if `clobber_abi` is used?~~ UPDATE: resolved https://github.com/rust-lang/rust/pull/130630#issuecomment-2367923121 - ~~ELF Application Binary Interface s390x Supplement says that `pm` (program mask, bits 20-23 of PSW) is "Cleared". There does not appear to be any registers associated with this in either [LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td) or [GCC](`33ccc1314d/gcc/config/s390/s390.h (L407-L431)`), so at this point I don't see any way other than to just ignore it. Is this okay as-is?~~ UPDATE: resolved https://github.com/rust-lang/rust/pull/130630#issuecomment-2367923121 - Is "areg" a good name for register class name for access registers? It may be a bit confusing between that and `reg_addr`, which uses the “a” constraint (https://github.com/rust-lang/rust/pull/119431)... Note: - GCC seems to [recognize only `a0` and `a1`](`33ccc1314d/gcc/config/s390/s390.h (L428-L429)`), and using `a[2-15]` [causes errors](https://godbolt.org/z/a46vx8jjn). Given that cg_gcc has a similar problem with other architecture (https://github.com/rust-lang/rustc_codegen_gcc/issues/485), I don't feel this is a blocker for this PR, but it is worth mentioning here. - `vreg` should be able to accept `#[repr(simd)]` types as input if the `vector` target feature added in https://github.com/rust-lang/rust/pull/127506 is enabled, but core_arch has no s390x vector type and both `#[repr(simd)]` and `core::simd` are unstable, so I have not implemented it in this PR. EDIT: And supporting it is probably more complex than doing the equivalent on other architectures... https://github.com/rust-lang/rust/pull/88245#issuecomment-905559591 cc `@uweigand` r? `@Amanieu` `@rustbot` label +O-SystemZ	2024-10-01 17:32:07 +02:00
David Lattimore	f48194ea55	Replace -Z default-hidden-visibility with -Z default-visibility MCP: https://github.com/rust-lang/compiler-team/issues/782 Co-authored-by: bjorn3 <17426603+bjorn3@users.noreply.github.com>	2024-10-01 22:32:13 +10:00
Guillaume Gomez	7cde7db1fc	Fmt	2024-09-27 22:09:18 +02:00
Guillaume Gomez	325b70890a	Merge commit '3187d32079b817522cc17413ec9185b130daf693' into subtree-update	2024-09-27 22:00:17 +02:00
David Wood	207bc77e15	codegen_ssa: consolidate tied feature checking `rustc_codegen_llvm` and `rustc_codegen_gcc` duplicated logic for checking if tied target features were partially enabled. This commit consolidates these checks into `rustc_codegen_ssa` in the `codegen_fn_attrs` query, which also is run pre-monomorphisation for each function, which ensures that this check is run for unused functions, as would be expected.	2024-09-24 15:48:49 +01:00
bors	4cbfcf1b7f	Auto merge of #130389 - Luv-Ray:LLVMMDNodeInContext2, r=nikic llvm: replace some deprecated functions `LLVMMDStringInContext` and `LLVMMDNodeInContext` are deprecated, replace them with `LLVMMDStringInContext2` and `LLVMMDNodeInContext2`. Also replace `Value` with `Metadata` in some function signatures for better consistency.	2024-09-24 12:07:48 +00:00
Michael Goulet	702a644b74	Check vtable projections for validity in miri	2024-09-23 19:38:26 -04:00
Michael Goulet	c682aa162b	Reformat using the new identifier sorting from rustfmt	2024-09-22 19:11:29 -04:00
Taiki Endo	fa125e2be6	Support clobber_abi and vector/access registers (clobber-only) in s390x inline assembly	2024-09-21 01:51:26 +09:00
Nicholas Nethercote	bfef2611d9	Reorder `ConstMethods`. It's crazy to have the integer methods in something close to random order. The reordering makes the gaps clear: `const_i64`, `const_i128`, `const_isize`, and `const_u16`. I guess they just aren't needed.	2024-09-19 20:10:42 +10:00
Luv-Ray	b7c5656713	replace some deprecated functions	2024-09-19 09:39:28 +08:00
Nicholas Nethercote	acb832d640	Use associative type defaults in `{Layout,FnAbi}OfHelpers`. This avoids some repetitive boilerplate code.	2024-09-17 10:25:06 +10:00
Nicholas Nethercote	a8d22eb39e	Rename supertraits of `CodegenMethods`. Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`. Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit changes the latter to `XyzCodegenMethods`, for consistency.	2024-09-17 10:24:43 +10:00
Nicholas Nethercote	410a2de0c0	Rename `{ArgAbi,IntrinsicCall}Methods`. They both are part of `BuilderMethods`, and so should have `Builder` in their name like all the other traits in `BuilderMethods`.	2024-09-17 10:24:43 +10:00
Nicholas Nethercote	5f98943b5a	Merge `HasCodegen` into `BuilderMethods`. It has `Backend` and `Deref` boudns, plus an associated type `CodegenCx`, and it has a single use. This commit "inlines" it into `BuilderMethods`, which makes the complicated backend trait situation a little simpler.	2024-09-17 10:24:43 +10:00
Ralf Jung	60ee1b7ac6	simd_shuffle: require index argument to be a vector	2024-09-14 14:43:24 +02:00
bors	5e842953cc	Auto merge of #130052 - khuey:clear-dilocation-after-const-emission, r=michaelwoerister Don't leave debug locations for constants sitting on the builder indefinitely Because constants are currently emitted before the prologue, leaving the debug location on the IRBuilder spills onto other instructions in the prologue and messes up both line numbers as well as the point LLVM chooses to be the prologue end. Example LLVM IR (irrelevant IR elided): Before: ``` define internal { i64, i64 } `@_ZN3tmp3Foo18var_return_opt_try17he02116165b0fc08cE(ptr` align 8 %self) !dbg !347 { start: %self.dbg.spill = alloca [8 x i8], align 8 %_0 = alloca [16 x i8], align 8 %residual.dbg.spill = alloca [0 x i8], align 1 #dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357) store ptr %self, ptr %self.dbg.spill, align 8, !dbg !357 #dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358) ``` After: ``` define internal { i64, i64 } `@_ZN3tmp3Foo18var_return_opt_try17h00b17d08874ddd90E(ptr` align 8 %self) !dbg !347 { start: %self.dbg.spill = alloca [8 x i8], align 8 %_0 = alloca [16 x i8], align 8 %residual.dbg.spill = alloca [0 x i8], align 1 #dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357) store ptr %self, ptr %self.dbg.spill, align 8 #dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358) ``` Note in particular how !357 from %residual.dbg.spill's dbg_declare no longer falls through onto the store to %self.dbg.spill. This fixes argument values at entry when the constant is a ZST (e.g. `<Option as Try>::Residual`). This fixes #130003 (but note that it does not fix issues with argument values and non-ZST constants, which emit their own stores that have debug info on them, like #128945). r? `@michaelwoerister`	2024-09-13 08:57:41 +00:00
Jubilee	88a2c62652	Rollup merge of #129981 - nnethercote:rm-serialize_bitcode, r=antoyo,tmiasko Remove `serialized_bitcode` from `LtoModuleCodegen`. It's unused. r? ``@bjorn3``	2024-09-09 19:20:36 -07:00
Nicholas Nethercote	bbe28cf1d9	Remove `serialized_bitcode` from `LtoModuleCodegen`. It's unused.	2024-09-09 09:00:50 +10:00
Kyle Huey	7ed9f945a2	Don't leave debug locations for constants sitting on the builder indefinitely. Because constants are currently emitted before the prologue, leaving the debug location on the IRBuilder spills onto other instructions in the prologue and messes up both line numbers as well as the point LLVM chooses to be the prologue end. Example LLVM IR (irrelevant IR elided): Before: define internal { i64, i64 } @_ZN3tmp3Foo18var_return_opt_try17he02116165b0fc08cE(ptr align 8 %self) !dbg !347 { start: %self.dbg.spill = alloca [8 x i8], align 8 %_0 = alloca [16 x i8], align 8 %residual.dbg.spill = alloca [0 x i8], align 1 #dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357) store ptr %self, ptr %self.dbg.spill, align 8, !dbg !357 #dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358) After: define internal { i64, i64 } @_ZN3tmp3Foo18var_return_opt_try17h00b17d08874ddd90E(ptr align 8 %self) !dbg !347 { start: %self.dbg.spill = alloca [8 x i8], align 8 %_0 = alloca [16 x i8], align 8 %residual.dbg.spill = alloca [0 x i8], align 1 #dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357) store ptr %self, ptr %self.dbg.spill, align 8 #dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358) Note in particular how !357 from %residual.dbg.spill's dbg_declare no longer falls through onto the store to %self.dbg.spill. This fixes argument values at entry when the constant is a ZST (e.g. <Option as Try>::Residual). This fixes #130003 (but note that it does not fix issues with argument values and non-ZST constants, which emit their own stores that have debug info on them, like #128945).	2024-09-06 23:12:18 +00:00
Nicholas Nethercote	747f68091a	Use sysroot crates maximally in `rustc_codegen_gcc`. This shrinks `compiler/rustc_codegen_gcc/Cargo.lock` quite a bit. The only remaining dependencies in `compiler/rustc_codegen_gcc/Cargo.lock` are `gccjit`, `lang_tester`, and `boml`, all of which aren't used in any other compiler crates. The commit also reorders and adds comments to the `extern crate` items so they match those in miri.	2024-09-02 15:35:58 +10:00
Trevor Gross	d2ff033302	Rollup merge of #128731 - RalfJung:simd-shuffle-vector, r=workingjubilee simd_shuffle intrinsic: allow argument to be passed as vector See https://github.com/rust-lang/rust/issues/128738 for context. I'd like to get rid of [this hack](`6c0b89dfac/compiler/rustc_codegen_ssa/src/mir/block.rs (L922-L935)`). https://github.com/rust-lang/rust/pull/128537 almost lets us do that since constant SIMD vectors will then be passed as immediate arguments. However, simd_shuffle for some reason actually takes an array as argument, not a vector, so the hack is still required to ensure that the array becomes an immediate (which then later stages of codegen convert into a vector, as that's what LLVM needs). This PR prepares simd_shuffle to also support a vector as the `idx` argument. Once this lands, stdarch can hopefully be updated to pass `idx` as a vector, and then support for arrays can be removed, which finally lets us get rid of that hack.	2024-08-27 01:46:50 -05:00
bors	e9c965df7b	Auto merge of #128812 - nnethercote:shrink-TyKind-FnPtr, r=compiler-errors Shrink `TyKind::FnPtr`. By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI. r? `@compiler-errors`	2024-08-14 00:56:53 +00:00
Ralf Jung	daedbd4d7a	make the GCC backend compatible with vector shuffle indices	2024-08-13 07:51:28 +02:00
Ralf Jung	194baa820d	simd_shuffle intrinsic: allow argument to be passed as vector (not just as array)	2024-08-13 07:51:17 +02:00
Guillaume Gomez	aea5087964	Rollup merge of #128537 - Jamesbarford:118980-const-vector, r=RalfJung,nikic const vector passed through to codegen This allows constant vectors using a repr(simd) type to be propagated through to the backend by reusing the functionality used to do a similar thing for the simd_shuffle intrinsic #118209 r? RalfJung	2024-08-12 17:09:15 +02:00
Guillaume Gomez	095ca33bb6	Rollup merge of #128149 - RalfJung:nontemporal_store, r=jieyouxu,Amanieu,Jubilee nontemporal_store: make sure that the intrinsic is truly just a hint The `!nontemporal` flag for stores in LLVM sounds like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](https://github.com/llvm/llvm-project/issues/64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores. ~~Blocked on https://github.com/rust-lang/stdarch/pull/1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~ Fixes https://github.com/rust-lang/rust/issues/114582 Cc `@Amanieu` `@workingjubilee`	2024-08-12 17:09:14 +02:00
Nicholas Nethercote	c4717cc9d1	Shrink `TyKind::FnPtr`. By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.	2024-08-09 14:33:25 +10:00
James Barford-Evans	27ca35aa1b	const vector passed to codegen	2024-08-08 11:15:03 +01:00
Matthias Krüger	904f5795a0	Rollup merge of #128221 - calebzulawski:implied-target-features, r=Amanieu Add implied target features to target_feature attribute See [zulip](https://rust-lang.zulipchat.com/#narrow/stream/208962-t-libs.2Fstdarch/topic/Why.20would.20target-feature.20include.20implied.20features.3F) for some context. Adds implied target features, e.g. `#[target_feature(enable = "avx2")]` acts like `#[target_feature(enable = "avx2,avx,sse4.2,sse4.1...")]`. Fixes #128125, fixes #128426 The implied feature sets are taken from [the rust reference](https://doc.rust-lang.org/reference/attributes/codegen.html?highlight=target-fea#x86-or-x86_64), there are certainly more features and targets to add. Please feel free to reassign this to whoever should review it. r? ``@Amanieu``	2024-08-07 20:28:16 +02:00
Caleb Zulawski	83276f5680	Hide implicit target features from diagnostics when possible	2024-08-07 00:43:52 -04:00

1 2 3 4 5 ...

433 Commits