mirror of
https://github.com/rust-lang/rust.git
synced 2024-11-25 08:13:41 +00:00
0d410be23c
optimize zipping over array iterators Fixes #115339 (somewhat) the new assembly: ```asm zip_arrays: .cfi_startproc vmovups (%rdx), %ymm0 leaq 32(%rsi), %rcx vxorps %xmm1, %xmm1, %xmm1 vmovups %xmm1, -24(%rsp) movq $0, -8(%rsp) movq %rsi, -88(%rsp) movq %rdi, %rax movq %rcx, -80(%rsp) vmovups %ymm0, -72(%rsp) movq $0, -40(%rsp) movq $32, -32(%rsp) movq -24(%rsp), %rcx vmovups (%rsi,%rcx), %ymm0 vorps -72(%rsp,%rcx), %ymm0, %ymm0 vmovups %ymm0, (%rsi,%rcx) vmovups (%rsi), %ymm0 vmovups %ymm0, (%rdi) vzeroupper retq ``` This is still longer than the slice version given in the issue but at least it eliminates the terrible `vpextrb`/`orb` chain. I guess this is due to excessive memcpys again (haven't looked at the llvmir)? The `TrustedLen` specialization is a drive-by change since I had to do something for the default impl anyway to be able to specialize the `TrustedRandomAccessNoCoerce` impl. |
||
---|---|---|
.. | ||
alloc | ||
backtrace@99faef833f | ||
core | ||
panic_abort | ||
panic_unwind | ||
portable-simd | ||
proc_macro | ||
profiler_builtins | ||
rtstartup | ||
rustc-std-workspace-alloc | ||
rustc-std-workspace-core | ||
rustc-std-workspace-std | ||
std | ||
stdarch@333e9e9977 | ||
sysroot | ||
test | ||
unwind |