nordic-dev.net/rust - rust

mirror of https://github.com/rust-lang/rust.git synced 2024-11-25 08:13:41 +00:00

Author	SHA1	Message	Date
Arpad Borsos	5fe296c6c5	Add benchmarks for `impl Debug for str` In order to inform future perf improvements and prevent regressions, let's add some benchmarks that stress `impl Debug for str`.	2024-05-01 09:54:29 +02:00
Guillaume Gomez	206e0df78d	Rollup merge of #115913 - FedericoStra:checked_ilog, r=the8472 checked_ilog: improve performance Addresses #115874. (This PR replicates the original #115875, which I accidentally closed by deleting my forked repository...)	2024-04-22 20:25:58 +02:00
Ralf Jung	c0b564b767	disable benches in Miri	2024-04-07 09:58:10 +02:00
okaneco	69637c9212	Add benches for `net` parsing Add benches for IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, SocketAddrV4, and SocketAddrV6 parsing	2024-03-04 18:46:24 -05:00
bors	fa404339c9	Auto merge of #85528 - the8472:iter-markers, r=dtolnay Implement iterator specialization traits on more adapters This adds * `TrustedLen` to `Skip` and `StepBy` * `TrustedRandomAccess` to `Skip` * `InPlaceIterable` and `SourceIter` to `Copied` and `Cloned` The first two might improve performance in the compiler itself since `skip` is used in several places. Constellations that would exercise the last point are probably rare since it would require an owning iterator that has references as Items somewhere in its iterator pipeline. Improvements for `Skip`: ``` # old test iter::bench_skip_trusted_random_access ... bench: 8,335 ns/iter (+/- 90) # new test iter::bench_skip_trusted_random_access ... bench: 2,753 ns/iter (+/- 27) ```	2024-01-21 11:17:46 +00:00
Matthias Krüger	17c95b6330	Rollup merge of #113142 - the8472:opt-cstr-display, r=Mark-Simulacrum optimize EscapeAscii's Display and CStr's Debug ``` old: ascii::bench_ascii_escape_display_mixed 17.97µs/iter +/- 204.00ns ascii::bench_ascii_escape_display_no_escape 545.00ns/iter +/- 6.00ns new: ascii::bench_ascii_escape_display_mixed 4.99µs/iter +/- 56.00ns ascii::bench_ascii_escape_display_no_escape 91.00ns/iter +/- 1.00ns ```	2024-01-20 09:37:25 +01:00
Nicholas Thompson	c65c35b3ef	Reduced amount of int_pow benches Also simplified the macros	2024-01-11 14:00:01 -05:00
Nicholas Thompson	7dcce97686	Edited int_pow micro-benchmarks	2024-01-11 11:30:12 -05:00
Nicholas Thompson	33a47df84a	Added int_pow micro-benchmarks	2024-01-11 11:30:12 -05:00
The8472	3aa73135cf	bench trustedrandomaccess specialization in zip	2024-01-10 18:59:44 +01:00
Jake Goulding	5772818dc8	Adjust library tests for unused_tuple_struct_fields -> dead_code	2024-01-02 15:34:37 -05:00
surechen	40ae34194c	remove redundant imports Detects redundant imports that can be eliminated. For #117772: to facilitate review and modification, the checking code and the removal of redundant imports are split into two PRs.	2023-12-10 10:56:22 +08:00
The 8472	3f55e8665c	benchmarks for Chars::advance_by	2023-11-27 22:06:35 +01:00
Federico Stra	0f9a4d9ab4	checked_ilog: add benchmarks	2023-09-22 16:03:14 +02:00
Deadbeef	626efab67f	fix	2023-07-23 09:58:31 +00:00
The 8472	6c87448b57	optimize Cstr/EscapeAscii display old: ascii::bench_ascii_escape_display_mixed 17.97µs/iter +/- 204.00ns ascii::bench_ascii_escape_display_no_escape 545.00ns/iter +/- 6.00ns new: ascii::bench_ascii_escape_display_mixed 4.99µs/iter +/- 56.00ns ascii::bench_ascii_escape_display_no_escape 91.00ns/iter +/- 1.00ns	2023-06-29 01:55:03 +02:00
The 8472	070ce235f2	Specialize StepBy<Range<{integer}>> For ranges < usize we determine the number of items StepBy would yield and then store that in the range.end instead of the actual end. This significantly simplifies calculation of the loop induction variable, especially in cases where StepBy::step (a usize) could overflow the Range's item type	2023-06-23 00:17:34 +02:00
The 8472	cfb0f11a9f	add benchmark	2023-06-12 13:03:29 +02:00
The 8472	b40896d17b	optimize next_chunk impls for Filter and FilterMap	2023-05-20 11:29:34 +02:00
Matthias Krüger	fc30207b16	Rollup merge of #108291 - chenyukang:yukang/fix-benchmarks, r=workingjubilee Fix more benchmark test with black_box Follow up fix for https://github.com/rust-lang/rust/issues/107590	2023-05-15 17:12:43 +02:00
mazong1123	b0a85d614d	Add shortcut for Grisu3 algorithm. Check requested digit length and the fractional or integral parts of the number. Falls back earlier without trying the Grisu algorithm if the specific condition meets. Fix #110129	2023-04-25 11:34:57 +08:00
bors	816f958ac3	Auto merge of #108157 - scottmcm:tuple-gt-via-partialcmp, r=dtolnay Use `partial_cmp` to implement tuple `lt`/`le`/`ge`/`gt` In today's implementation, `(A, B)::gt` contains calls to both `A::eq` and `A::gt`. That's fine for primitives, but for things like `String`s it's kinda weird -- `(String, usize)::gt` has a call to both `bcmp` and `memcmp` (<https://rust.godbolt.org/z/7jbbPMesf>) because when `bcmp` says the `String`s aren't equal, it turns around and calls `memcmp` to find out which one's bigger. This PR changes the implementation to instead implement `(A, …, C, Z)::gt` using `A::partial_cmp`, `…::partial_cmp`, `C::partial_cmp`, and `Z::gt`. (And analogously for `lt`, `le`, and `ge`.) That way expensive comparisons don't need to be repeated. Technically this is an observable change on stable, so I've marked it `needs-fcp` + `T-libs-api` and will r? rust-lang/libs-api I'm hoping that this will be non-controversial, however, since it's very similar to the observable changes that were made to the derives (#81384 #98655) -- like those, this only changes behaviour if a type overrode behaviour in a way inconsistent with the rules for the various traits involved. (The first commit here is #108156, adding the codegen test, which I used to make sure this doesn't regress behaviour for primitives.) Zulip conversation about this change: <https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/.60.3E.60.20on.20Tuples/near/328392927>.	2023-03-05 22:02:26 +00:00
yukang	62cfd8a123	fix more benchmark test with black_box	2023-02-21 03:26:40 +00:00
Scott McMurray	4492793e0d	Add a slightly-contrived tuple comparison benchmark	2023-02-17 11:46:19 -08:00
kadmin	826abcc728	Shrink size of array benchmarks	2023-02-14 05:01:24 +00:00
kadmin	cbd1b81bd2	Add array::map benchmarks	2023-02-11 04:23:53 +00:00
yukang	fe84cecf60	fix #107590 , Fix benchmarks in library/core with black_box	2023-02-03 00:33:36 +08:00
Thom Chiovoloni	a4bf36e87b	Update rand in the stdlib tests, and remove the getrandom feature from it	2023-01-04 14:52:41 -08:00
Dylan DPC	1db7f690b1	Rollup merge of #103570 - lukas-code:stabilize-ilog, r=scottmcm Stabilize integer logarithms Stabilizes feature `int_log`. I've also made the functions const stable, because they don't depend on any unstable const features. `rustc_allow_const_fn_unstable` is just there for `Option::expect`, which could be replaced with a `match` and `panic!`. cc ``@rust-lang/wg-const-eval`` closes https://github.com/rust-lang/rust/issues/70887 (tracking issue) ~~blocked on FCP finishing: https://github.com/rust-lang/rust/issues/70887#issuecomment-1289028216~~ FCP finished: https://github.com/rust-lang/rust/issues/70887#issuecomment-1302121266	2022-11-09 19:21:21 +05:30
The 8472	b00666ed09	add benchmark for iter::ArrayChunks::fold specialization This also updates the existing iter::Copied::next_chunk benchmark so that the thing it benches doesn't get masked by the ArrayChunks specialization	2022-11-07 21:44:24 +01:00
Lukas Markeffsky	9e36fd926c	stabilize `int_log`	2022-10-26 11:58:33 +02:00
The 8472	963d6f757c	add a benchmark for slice_iter.copied().array_chunks()	2022-10-17 23:40:21 +02:00
Tim Vermeulen	db2b4a3a7e	Use internal iteration in `Iterator::{cmp_by, partial_cmp_by, eq_by}`	2022-08-21 12:23:10 +02:00
Eric Holk	c18f22058b	Rename integer log* methods to ilog* This reflects the consensus from the libs team as reported at https://github.com/rust-lang/rust/issues/70887#issuecomment-1209513261 Co-authored-by: Yosh Wuyts <github@yosh.is>	2022-08-09 10:20:49 -07:00
Nilstrieb	3358a41acb	Add unicode fast path to `is_printable` Before, it would enter the full expensive check even for normal ascii characters. Now, it skips the check for the ascii characters in `32..127`. This range was checked manually from the current behavior.	2022-05-31 10:51:35 +02:00
bors	12d3f107c1	Auto merge of #96626 - thomcc:rand-bump, r=m-ou-se Avoid using `rand::thread_rng` in the stdlib benchmarks. This is kind of an anti-pattern because it introduces extra nondeterminism for no real reason. In thread_rng's case this comes both from the random seed and also from the reseeding operations it does, which occasionally does syscalls (which adds additional nondeterminism). The impact of this would be pretty small in most cases, but it's a good practice to avoid (particularly because avoiding it was not hard). Anyway, several of our benchmarks already did the right thing here anyway, so the change was pretty easy and mostly just applying it more universally. That said, the stdlib benchmarks aren't particularly stable (nor is our benchmark framework particularly great), so arguably this doesn't matter that much in practice. ~~Anyway, this also bumps the `rand` dev-dependency to 0.8, since it had fallen somewhat out of date.~~ Nevermind, too much of a headache.	2022-05-05 05:08:44 +00:00
The 8472	e3db41bf97	add benchmark	2022-05-02 20:54:46 +02:00
Thom Chiovoloni	0812759840	Avoid use of `rand::thread_rng` in stdlib benchmarks	2022-05-02 00:08:21 -07:00
Eduardo Sánchez Muñoz	93ae6f80e3	Make some `usize`-typed masks definition agnostic to the size of `usize` Some masks were defined as ```rust const NONASCII_MASK: usize = 0x80808080_80808080u64 as usize; ``` where it was assumed that `usize` is never wider than 64 bits, which is currently true. To make those constants valid in a hypothetical 128-bit target, these constants have been redefined in a `usize`-width-agnostic way ```rust const NONASCII_MASK: usize = usize::from_ne_bytes([0x80; size_of::<usize>()]); ``` There are already some cases where Rust anticipates the possibility of supporting 128-bit targets, such as not implementing `From<usize>` for `u64`.	2022-04-15 17:04:59 +02:00
T-O-R-U-S	72a25d05bf	Use implicit capture syntax in format_args This updates the standard library's documentation to use the new syntax. The documentation is worthwhile to update as it should be more idiomatic (particularly for features like this, which are nice for users to get acquainted with). The general codebase is likely more hassle than benefit to update: it'll hurt git blame, and generally updates can be done by folks updating the code if (and when) that makes things more readable with the new format. A few places in the compiler and library code are updated (mostly just due to already having been done when this commit was first authored).	2022-03-10 10:23:40 -05:00
Scott McMurray	8ca47d7ae4	Stop manually SIMDing in swap_nonoverlapping Like I previously did for `reverse`, this leaves it to LLVM to pick how to vectorize it, since it can know better the chunk size to use, compared to the "32 bytes always" approach we currently have. It does still need logic to type-erase where appropriate, though, as while LLVM is now smart enough to vectorize over slices of things like `[u8; 4]`, it fails to do so over slices of `[u8; 3]`. As a bonus, this also means one no longer gets the spurious `memcpy`(s?) at the end of swapping a slice of `__m256`s: <https://rust.godbolt.org/z/joofr4v8Y>	2022-02-21 00:54:02 -08:00
Thom Chiovoloni	ebbccaf6bf	Respond to review feedback, and improve implementation somewhat	2022-02-05 11:15:18 -08:00
Thom Chiovoloni	ed01324835	Fix zh::SMALL string in core::str benchmarks	2022-02-05 11:15:17 -08:00
Thom Chiovoloni	628b217326	Optimize `core::str::Chars::count`	2022-02-05 11:15:17 -08:00
bors	ffdf18d144	Auto merge of #88788 - falk-hueffner:speedup-int-log10-branchless, r=joshtriplett Speedup int log10 branchless This is achieved with a branchless bit-twiddling implementation of the case x < 100_000, and using this as building block. Benchmark on an Intel i7-8700K (Coffee Lake):
```
name                                   old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
num::int_log::u8_log10_predictable     165          169                     4    2.42%   x 0.98
num::int_log::u8_log10_random          438          423                   -15   -3.42%   x 1.04
num::int_log::u8_log10_random_small    438          423                   -15   -3.42%   x 1.04
num::int_log::u16_log10_predictable    633          417                  -216  -34.12%   x 1.52
num::int_log::u16_log10_random         908          471                  -437  -48.13%   x 1.93
num::int_log::u16_log10_random_small   945          471                  -474  -50.16%   x 2.01
num::int_log::u32_log10_predictable    1,496        1,340                -156  -10.43%   x 1.12
num::int_log::u32_log10_random         1,076        873                  -203  -18.87%   x 1.23
num::int_log::u32_log10_random_small   1,145        874                  -271  -23.67%   x 1.31
num::int_log::u64_log10_predictable    4,005        3,171                -834  -20.82%   x 1.26
num::int_log::u64_log10_random         1,247        1,021                -226  -18.12%   x 1.22
num::int_log::u64_log10_random_small   1,265        921                  -344  -27.19%   x 1.37
num::int_log::u128_log10_predictable   39,667       39,579                -88   -0.22%   x 1.00
num::int_log::u128_log10_random        6,456        6,696                 240    3.72%   x 0.96
num::int_log::u128_log10_random_small  4,108        3,903                -205   -4.99%   x 1.05
```
Benchmark on an M1 Mac Mini:
```
name                                   old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
num::int_log::u8_log10_predictable     143          130                   -13   -9.09%   x 1.10
num::int_log::u8_log10_random          375          325                   -50  -13.33%   x 1.15
num::int_log::u8_log10_random_small    376          325                   -51  -13.56%   x 1.16
num::int_log::u16_log10_predictable    500          322                  -178  -35.60%   x 1.55
num::int_log::u16_log10_random         794          405                  -389  -48.99%   x 1.96
num::int_log::u16_log10_random_small   1,035        405                  -630  -60.87%   x 2.56
num::int_log::u32_log10_predictable    1,144        894                  -250  -21.85%   x 1.28
num::int_log::u32_log10_random         832          786                   -46   -5.53%   x 1.06
num::int_log::u32_log10_random_small   832          787                   -45   -5.41%   x 1.06
num::int_log::u64_log10_predictable    2,681        2,057                -624  -23.27%   x 1.30
num::int_log::u64_log10_random         1,015        806                  -209  -20.59%   x 1.26
num::int_log::u64_log10_random_small   1,004        795                  -209  -20.82%   x 1.26
num::int_log::u128_log10_predictable   56,825       56,526               -299   -0.53%   x 1.01
num::int_log::u128_log10_random        9,056        8,861                -195   -2.15%   x 1.02
num::int_log::u128_log10_random_small  1,528        1,527                  -1   -0.07%   x 1.00
```
The 128 bit case remains ridiculously slow because llvm fails to optimize division by a constant 128-bit value to multiplications. This could be worked around but it seems preferable to fix this in llvm. From u32 up, table lookup (like suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but requires a hardware `leading_zeros` to be viable, and might clog up the cache.	2021-10-12 03:18:54 +00:00
The8472	4c44f061d8	benchmark for str.chars().count()	2021-09-11 00:25:41 +02:00
Falk Hüffner	57c623570a	Cosmetic fixes.	2021-09-09 20:06:46 +02:00
Falk Hüffner	0c26a3bc0c	Add benchmark for integer log10.	2021-09-06 12:19:24 +02:00
Smitty	bdfcb88e8b	Use HTTPS links where possible	2021-06-23 16:26:46 -04:00
Ralf Jung	23d54ad96f	move core::hint::black_box under its own feature gate	2021-04-25 11:08:12 +02:00


63 Commits