rust/compiler
bors f6fa358a18 Auto merge of #127226 - mat-1:optimize-siphash-round, r=nnethercote
Optimize SipHash by reordering compress instructions

This PR optimizes hashing by reordering the instructions in the sip.rs `compress` macro so that the CPU can execute more of them in parallel. The new order is taken directly from Fig 2.1 in [the SipHash paper](https://eprint.iacr.org/2012/351.pdf), but with the xors moved, which makes it a little faster. I tried to optimize it further after this, but I think this might be the optimal instruction order. Note that this shouldn't change the behavior of hashing at all: only statements that don't depend on each other were reordered.
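As a rough illustration of why such a reordering is safe, here is a minimal sketch of one SipRound written two ways. The function names and the array-based state are assumptions made for this example (the real code is a macro over four variables), and the interleaved order shown is one plausible arrangement, not necessarily the exact order used in the PR. In the sequential version each half-round is a single dependency chain; in the interleaved version the two independent chains (`v0`/`v1` and `v2`/`v3`) alternate, so a superscalar CPU has independent work available on adjacent instructions.

```rust
/// One SipRound with each half-round written as a sequential chain.
/// (Hypothetical helper for illustration; not the actual `compress` macro.)
fn sip_round_sequential(mut v: [u64; 4]) -> [u64; 4] {
    v[0] = v[0].wrapping_add(v[1]);
    v[1] = v[1].rotate_left(13);
    v[1] ^= v[0];
    v[0] = v[0].rotate_left(32);
    v[2] = v[2].wrapping_add(v[3]);
    v[3] = v[3].rotate_left(16);
    v[3] ^= v[2];

    v[0] = v[0].wrapping_add(v[3]);
    v[3] = v[3].rotate_left(21);
    v[3] ^= v[0];
    v[2] = v[2].wrapping_add(v[1]);
    v[1] = v[1].rotate_left(17);
    v[1] ^= v[2];
    v[2] = v[2].rotate_left(32);
    v
}

/// The same round with the two independent chains interleaved.
/// Only statements with no data dependency on each other swap places.
fn sip_round_interleaved(mut v: [u64; 4]) -> [u64; 4] {
    v[0] = v[0].wrapping_add(v[1]);
    v[2] = v[2].wrapping_add(v[3]);
    v[1] = v[1].rotate_left(13);
    v[1] ^= v[0];
    v[3] = v[3].rotate_left(16);
    v[3] ^= v[2];
    v[0] = v[0].rotate_left(32);

    v[2] = v[2].wrapping_add(v[1]);
    v[0] = v[0].wrapping_add(v[3]);
    v[1] = v[1].rotate_left(17);
    v[1] ^= v[2];
    v[3] = v[3].rotate_left(21);
    v[3] ^= v[0];
    v[2] = v[2].rotate_left(32);
    v
}

fn main() {
    // SipHash initialization constants ("somepseudorandomlygeneratedbytes").
    let mut a: [u64; 4] = [
        0x736f6d6570736575,
        0x646f72616e646f6d,
        0x6c7967656e657261,
        0x7465646279746573,
    ];
    let mut b = a;
    // Both orderings must produce identical state after every round.
    for _ in 0..1000 {
        a = sip_round_sequential(a);
        b = sip_round_interleaved(b);
        assert_eq!(a, b);
    }
    println!("orderings agree: {:016x?}", a);
}
```

Checking state equality after every round, rather than only at the end, catches an accidental dependency violation at the first round where it occurs.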

It appears that the current order hasn't changed since its original implementation from 2012 (fada46c421, diff-b751133c229259d7099bbbc7835324e5504b91ab1aded9464f0c48cd22e5e420R35), which doesn't look like it was written with data dependencies in mind.

Running `./x bench library/core --stage 0 --test-args hash` before and after this change shows the following results:

Before:
```
benchmarks:
    hash::sip::bench_bytes_4             7.20/iter +/- 0.70
    hash::sip::bench_bytes_7             9.01/iter +/- 0.35
    hash::sip::bench_bytes_8             8.12/iter +/- 0.10
    hash::sip::bench_bytes_a_16         10.07/iter +/- 0.44
    hash::sip::bench_bytes_b_32         13.46/iter +/- 0.71
    hash::sip::bench_bytes_c_128        37.75/iter +/- 0.48
    hash::sip::bench_long_str          121.18/iter +/- 3.01
    hash::sip::bench_str_of_8_bytes     11.20/iter +/- 0.25
    hash::sip::bench_str_over_8_bytes   11.20/iter +/- 0.26
    hash::sip::bench_str_under_8_bytes   9.89/iter +/- 0.59
    hash::sip::bench_u32                 9.57/iter +/- 0.44
    hash::sip::bench_u32_keyed           6.97/iter +/- 0.10
    hash::sip::bench_u64                 8.63/iter +/- 0.07
```
After:
```
benchmarks:
    hash::sip::bench_bytes_4             6.64/iter +/- 0.14
    hash::sip::bench_bytes_7             8.19/iter +/- 0.07
    hash::sip::bench_bytes_8             8.59/iter +/- 0.68
    hash::sip::bench_bytes_a_16          9.73/iter +/- 0.49
    hash::sip::bench_bytes_b_32         12.70/iter +/- 0.06
    hash::sip::bench_bytes_c_128        32.38/iter +/- 0.20
    hash::sip::bench_long_str          102.99/iter +/- 0.82
    hash::sip::bench_str_of_8_bytes     10.71/iter +/- 0.21
    hash::sip::bench_str_over_8_bytes   11.73/iter +/- 0.17
    hash::sip::bench_str_under_8_bytes  10.33/iter +/- 0.41
    hash::sip::bench_u32                10.41/iter +/- 0.29
    hash::sip::bench_u32_keyed           9.50/iter +/- 0.30
    hash::sip::bench_u64                 8.44/iter +/- 1.09
```
I ran this on my computer, so there's some noise, but at least `bench_long_str` is clearly faster: 121.18 before versus 102.99 after per iteration, which is about 15% less time (~18% higher throughput).
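The two ways of reading that speedup can be checked with a few lines of arithmetic. This is only a sketch: the helper names are made up for this example, and the inputs are the `bench_long_str` figures from the tables above.

```rust
/// Percentage of per-iteration time saved going from `before` to `after`.
/// (Hypothetical helper for illustration.)
fn time_reduction_pct(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Percentage increase in iterations per unit time.
/// (Hypothetical helper for illustration.)
fn throughput_gain_pct(before: f64, after: f64) -> f64 {
    (before / after - 1.0) * 100.0
}

fn main() {
    // bench_long_str figures from the before/after benchmark output.
    let (before, after) = (121.18, 102.99);
    println!("time reduction:  {:.1}%", time_reduction_pct(before, after));
    println!("throughput gain: {:.1}%", throughput_gain_pct(before, after));
}
```

The same helpers can be applied to the other rows to see which differences exceed the reported `+/-` noise.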

Also, I noticed that the same compress function from the library is used in the compiler, so I took the liberty of copy-pasting this change there as well.

Thanks `@semisol` for porting SipHash for another project which led me to notice this issue in Rust, and for helping investigate. <3
2024-07-04 04:03:45 +00:00
rustc
rustc_abi Auto merge of #126326 - eggyal:ununsafe-StableOrd, r=michaelwoerister 2024-06-25 15:51:35 +00:00
rustc_arena Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_ast Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_ast_ir Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_ast_lowering Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_ast_passes Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_ast_pretty Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_attr Use a dedicated type instead of a reference for the diagnostic context 2024-06-18 15:42:11 +00:00
rustc_baked_icu_data Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_borrowck Auto merge of #125507 - compiler-errors:type-length-limit, r=lcnr 2024-07-03 11:56:36 +00:00
rustc_builtin_macros Simplify CfgEval. 2024-07-02 10:46:43 +10:00
rustc_codegen_cranelift Fix spans 2024-07-02 15:48:48 -04:00
rustc_codegen_gcc Fix spans 2024-07-02 15:48:48 -04:00
rustc_codegen_llvm Rollup merge of #126803 - tgross35:verbose-asm, r=Amanieu 2024-07-03 17:26:53 +02:00
rustc_codegen_ssa Auto merge of #126094 - petrochenkov:libsearch, r=michaelwoerister 2024-07-03 14:15:31 +00:00
rustc_const_eval Auto merge of #125507 - compiler-errors:type-length-limit, r=lcnr 2024-07-03 11:56:36 +00:00
rustc_data_structures Auto merge of #127226 - mat-1:optimize-siphash-round, r=nnethercote 2024-07-04 04:03:45 +00:00
rustc_driver Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_driver_impl Move codegen_and_build_linker from Queries to Linker 2024-07-01 11:00:49 +00:00
rustc_error_codes Auto merge of #126319 - workingjubilee:rollup-lendnud, r=workingjubilee 2024-06-12 11:10:50 +00:00
rustc_error_messages Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_errors Auto merge of #126996 - oli-obk:do_not_count_errors, r=nnethercote 2024-07-01 06:35:58 +00:00
rustc_expand Shrink parser positions from usize to u32. 2024-07-02 17:03:53 +10:00
rustc_feature add rustc_dump_def_parents attribute 2024-06-30 19:31:21 +01:00
rustc_fluent_macro Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_fs_util Remove useless tidy-alphabetical markers. 2024-06-20 09:23:20 +10:00
rustc_graphviz Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_hir Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_hir_analysis Rollup merge of #127181 - BoxyUwU:dump_def_parents, r=compiler-errors 2024-07-01 08:53:07 +02:00
rustc_hir_pretty implement new effects desugaring 2024-06-28 10:57:35 +00:00
rustc_hir_typeck Rollup merge of #127253 - chenyukang:yukang-fix-126246-fn-parameters-check, r=estebank 2024-07-03 23:30:08 +02:00
rustc_incremental Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_index Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_index_macros rustc_span: Minor improvements 2024-06-16 14:08:25 +03:00
rustc_infer Instance::resolve -> Instance::try_resolve, and other nits 2024-07-02 17:28:03 -04:00
rustc_interface Rollup merge of #127184 - bjorn3:interface_refactor2, r=Nadrieril 2024-07-03 23:30:07 +02:00
rustc_lexer Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_lint Instance::resolve -> Instance::try_resolve, and other nits 2024-07-02 17:28:03 -04:00
rustc_lint_defs Ensure out_of_scope_macro_calls lint is registered 2024-07-01 00:25:25 +01:00
rustc_llvm Rename the asm-comments compiler flag to verbose-asm 2024-07-02 21:42:01 -04:00
rustc_log Bump tracing-tree and allow rendering lines again 2024-06-12 10:11:41 +00:00
rustc_macros Remove redundant argument from subdiagnostic method 2024-06-18 15:42:11 +00:00
rustc_metadata Auto merge of #120639 - fee1-dead-contrib:new-effects-desugaring, r=oli-obk 2024-06-29 20:08:10 +00:00
rustc_middle Rollup merge of #127294 - ldm0:ldm_coroutine2, r=lcnr 2024-07-03 23:30:10 +02:00
rustc_mir_build Auto merge of #125507 - compiler-errors:type-length-limit, r=lcnr 2024-07-03 11:56:36 +00:00
rustc_mir_dataflow Auto merge of #127036 - cjgillot:sparse-state, r=oli-obk 2024-07-03 18:52:04 +00:00
rustc_mir_transform Rollup merge of #127294 - ldm0:ldm_coroutine2, r=lcnr 2024-07-03 23:30:10 +02:00
rustc_monomorphize Fix spans 2024-07-02 15:48:48 -04:00
rustc_next_trait_solver Rollup merge of #127145 - compiler-errors:as_lang_item, r=lcnr 2024-07-03 17:26:54 +02:00
rustc_parse Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_parse_format Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_passes Rollup merge of #127092 - compiler-errors:rtn-dots-redux, r=estebank 2024-07-03 23:30:07 +02:00
rustc_pattern_analysis Replace f16 and f128 pattern matching stubs with real implementations 2024-06-23 04:28:42 -05:00
rustc_privacy Do not ICE in privacy when type inference fails. 2024-06-17 10:09:27 +00:00
rustc_query_impl Allow tracing through item_bounds query invocations on opaques 2024-06-19 08:47:55 +00:00
rustc_query_system Auto merge of #126326 - eggyal:ununsafe-StableOrd, r=michaelwoerister 2024-06-25 15:51:35 +00:00
rustc_resolve Auto merge of #127127 - notriddle:notriddle/pulldown-cmark-0.11, r=GuillaumeGomez 2024-07-04 01:50:31 +00:00
rustc_sanitizers Split out IntoIterator and non-Iterator constructors for AliasTy/AliasTerm/TraitRef/projection 2024-06-24 11:28:21 -04:00
rustc_serialize chore: remove duplicate words 2024-07-02 11:25:31 +08:00
rustc_session Rename the asm-comments compiler flag to verbose-asm 2024-07-02 21:42:01 -04:00
rustc_smir Instance::resolve -> Instance::try_resolve, and other nits 2024-07-02 17:28:03 -04:00
rustc_span add rustc_dump_def_parents attribute 2024-06-30 19:31:21 +01:00
rustc_symbol_mangling Fix FnMut/Fn shim for coroutine-closures that capture references 2024-06-29 17:38:02 -04:00
rustc_target Use the aligned size for alloca at args when the pass mode is cast. 2024-07-02 06:33:35 +08:00
rustc_trait_selection Auto merge of #125507 - compiler-errors:type-length-limit, r=lcnr 2024-07-03 11:56:36 +00:00
rustc_traits Use tidy to sort crate attributes for all compiler crates. 2024-06-12 15:49:10 +10:00
rustc_transmute safe transmute: support non-ZST, variantful, uninhabited enums 2024-06-14 21:11:08 +00:00
rustc_ty_utils Auto merge of #125507 - compiler-errors:type-length-limit, r=lcnr 2024-07-03 11:56:36 +00:00
rustc_type_ir Rollup merge of #127145 - compiler-errors:as_lang_item, r=lcnr 2024-07-03 17:26:54 +02:00
rustc_type_ir_macros Uplift TraitPredicate 2024-05-11 18:20:00 -04:00
stable_mir Add method to get all attributes on a definition 2024-06-28 13:24:41 +08:00