rust/tests at 7c385f5a03c08df98aca71fbe4ef57dff66ffa56 - rust

mirror of https://github.com/rust-lang/rust.git synced 2024-11-22 14:55:26 +00:00


bors fd9bf59436 Auto merge of #111999 - scottmcm:codegen-less-memcpy, r=compiler-errors

Use `load`+`store` instead of `memcpy` for small integer arrays

I was inspired by #98892 to see whether, rather than making `mem::swap` do something smart in the library, we could update MIR assignments like `_1 = _2` to do something smarter than `memcpy` for types small enough that doing it inline is going to be better than a `memcpy` call in assembly anyway. After all, special code may help `mem::swap`, but if the "obvious" MIR can just result in the correct thing, that helps everything -- other code like `mem::replace`, people doing it manually, and just passing around by value in general -- and it also makes MIR inlining happier, since it doesn't need to deal with all the complicated library code if it just sees a couple of assignments.

LLVM will turn the short, known-length `memcpy`s into direct instructions in the backend, but that's too late for it to be able to remove `alloca`s. In general, replacing `memcpy`s with typed instructions in the middle-end -- even for `memcpy.inline`, where it knows it won't be a function call -- is hard [due to poison propagation issues](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/memcpy.20vs.20load-store.20for.20MIR.20assignments/near/360376712). So because we know more about the type invariants -- these are typed copies -- rustc can emit something more specific, allowing LLVM to `mem2reg` away the `alloca`s in some situations.

#52051 previously did something like this in the library for `mem::swap`, but it ended up regressing when MIR inlining was enabled (`cbbf06b0cd`), so this has been suboptimal on stable for ≈5 releases now.

The code in this PR is narrowly targeted at just integer arrays in LLVM, but works via a new method on the [`LayoutTypeMethods`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.LayoutTypeMethods.html) trait, so specific backends based on cg_ssa can enable this for more situations over time, as we find them. I don't want to try to bite off too much in this PR, though. (Transparent newtypes and simple things like the 3×usize `String` would be obvious candidates for a follow-up.)

Codegen demonstrations: <https://llvm.godbolt.org/z/fK8hT9aqv>

Before:

```llvm
define void @swap_rgb48_old(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #1 {
  %a.i = alloca [3 x i16], align 2
  call void @llvm.lifetime.start.p0(i64 6, ptr nonnull %a.i)
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %a.i, ptr noundef nonnull align 2 dereferenceable(6) %x, i64 6, i1 false)
  tail call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %x, ptr noundef nonnull align 2 dereferenceable(6) %y, i64 6, i1 false)
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %y, ptr noundef nonnull align 2 dereferenceable(6) %a.i, i64 6, i1 false)
  call void @llvm.lifetime.end.p0(i64 6, ptr nonnull %a.i)
  ret void
}
```

Note it going to stack:

```nasm
swap_rgb48_old:                         # @swap_rgb48_old
        movzx   eax, word ptr [rdi + 4]
        mov     word ptr [rsp - 4], ax
        mov     eax, dword ptr [rdi]
        mov     dword ptr [rsp - 8], eax
        movzx   eax, word ptr [rsi + 4]
        mov     word ptr [rdi + 4], ax
        mov     eax, dword ptr [rsi]
        mov     dword ptr [rdi], eax
        movzx   eax, word ptr [rsp - 4]
        mov     word ptr [rsi + 4], ax
        mov     eax, dword ptr [rsp - 8]
        mov     dword ptr [rsi], eax
        ret
```

Now:

```llvm
define void @swap_rgb48(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #0 {
start:
  %0 = load <3 x i16>, ptr %x, align 2
  %1 = load <3 x i16>, ptr %y, align 2
  store <3 x i16> %1, ptr %x, align 2
  store <3 x i16> %0, ptr %y, align 2
  ret void
}
```

This still lowers to `dword`+`word` operations, but has no stack traffic:

```nasm
swap_rgb48:                             # @swap_rgb48
        mov     eax, dword ptr [rdi]
        movzx   ecx, word ptr [rdi + 4]
        movzx   edx, word ptr [rsi + 4]
        mov     r8d, dword ptr [rsi]
        mov     dword ptr [rdi], r8d
        mov     word ptr [rdi + 4], dx
        mov     word ptr [rsi + 4], cx
        mov     dword ptr [rsi], eax
        ret
```

And as a demonstration that this isn't just `mem::swap`, a `mem::replace` on a small array (since replace doesn't use swap since #83022), which used to be `memcpy`s in LLVM, changes in IR:

```llvm
define void @replace_short_array(ptr noalias nocapture noundef sret([3 x i32]) dereferenceable(12) %0, ptr noalias noundef align 4 dereferenceable(12) %r, ptr noalias nocapture noundef readonly dereferenceable(12) %v) unnamed_addr #0 {
start:
  %1 = load <3 x i32>, ptr %r, align 4
  store <3 x i32> %1, ptr %0, align 4
  %2 = load <3 x i32>, ptr %v, align 4
  store <3 x i32> %2, ptr %r, align 4
  ret void
}
```

but that still lowers to reasonable `dword`+`qword` instructions:

```nasm
replace_short_array:                    # @replace_short_array
        mov     rax, rdi
        mov     rcx, qword ptr [rsi]
        mov     edi, dword ptr [rsi + 8]
        mov     dword ptr [rax + 8], edi
        mov     qword ptr [rax], rcx
        mov     rcx, qword ptr [rdx]
        mov     edx, dword ptr [rdx + 8]
        mov     dword ptr [rsi + 8], edx
        mov     qword ptr [rsi], rcx
        ret
```

2023-06-06 01:50:28 +00:00
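The commit message shows only the generated IR and assembly for `swap_rgb48` and `replace_short_array`, not the Rust source. As a rough sketch, functions of the following shape would match those symbols. The signatures here are inferred from the IR parameter types (`[3 x i16]`, `[3 x i32]`, the `sret` return) and are an assumption, not a copy of the actual test files or the Godbolt input.

```rust
use std::mem;

/// Swaps two 48-bit RGB pixels (three `u16` channels each).
/// Assumed signature: two 6-byte, 2-aligned, non-aliasing pointers, as in the IR.
pub fn swap_rgb48(x: &mut [u16; 3], y: &mut [u16; 3]) {
    mem::swap(x, y);
}

/// Stores `v` into `*r` and returns the previous contents.
/// Assumed signature: the 12-byte array result corresponds to the `sret` pointer.
pub fn replace_short_array(r: &mut [u32; 3], v: [u32; 3]) -> [u32; 3] {
    mem::replace(r, v)
}

fn main() {
    let mut a = [1u16, 2, 3];
    let mut b = [4u16, 5, 6];
    swap_rgb48(&mut a, &mut b);
    assert_eq!(a, [4, 5, 6]);
    assert_eq!(b, [1, 2, 3]);

    let mut r = [7u32, 8, 9];
    let old = replace_short_array(&mut r, [0, 0, 0]);
    assert_eq!(old, [7, 8, 9]);
    assert_eq!(r, [0, 0, 0]);
}
```

The observable behavior is the same before and after the change; only the emitted IR differs, so comparing the two requires inspecting the output of something like `rustc -O --emit=llvm-ir` rather than running the program.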
assembly	Fix linkage for large binaries on mips64 platforms ...	2023-05-29 10:57:03 -06:00
auxiliary
codegen	Use `load`-`store` instead of `memcpy` for short integer arrays	2023-06-04 00:51:49 -07:00
codegen-units
debuginfo	Add multiple borrow test.	2023-05-13 10:32:32 +00:00
incremental	Implement custom diagnostic for ConstParamTy	2023-06-01 18:21:42 +00:00
mir-opt	Auto merge of #112240 - cjgillot:recurse-inline, r=scottmcm	2023-06-04 03:39:24 +00:00
pretty	Rollup merge of #111042 - Zalathar:no-coverage, r=wesleywiser	2023-05-01 17:10:24 +02:00
run-make	Auto merge of #86844 - bjorn3:global_alloc_improvements, r=pnkfelix	2023-05-25 16:59:57 +00:00
run-make-fulldeps	Move BodyWithBorrowckFacts to consumers	2023-05-23 14:36:36 +02:00
run-pass-valgrind
rustdoc	Auto merge of #110945 - wackbyte:doc-vis-on-inherent-assoc-types, r=jsha	2023-06-05 04:54:21 +00:00
rustdoc-gui	Migrate GUI colors test to original CSS color format	2023-06-04 15:55:30 +02:00
rustdoc-js	Rollup merge of #110780 - notriddle:notriddle/slice-index, r=GuillaumeGomez	2023-05-06 09:09:31 +09:00
rustdoc-js-std
rustdoc-json	Serialize all enums as externally tagged to guarantee compatibility with binary formats such as bincode or postcard	2023-05-22 18:22:08 +01:00
rustdoc-ui	rustdoc: get unnormalized link destination for suggestions	2023-05-26 18:38:46 -07:00
ui	Auto merge of #112324 - matthiaskrgr:rollup-qscmi3c, r=matthiaskrgr	2023-06-05 22:44:59 +00:00
ui-fulldeps	Use translatable diagnostics in `rustc_const_eval`	2023-06-01 14:45:18 +00:00
COMPILER_TESTS.md