rust/library/core/benches
bors caca2121ff Auto merge of #74024 - Folyd:master, r=m-ou-se
Improve slice.binary_search_by()'s best-case performance to O(1)

This PR aimed to improve the [slice.binary_search_by()](https://doc.rust-lang.org/std/primitive.slice.html#method.binary_search_by)'s best-case performance to O(1).

# Noticed

I don't know why the docs of `binary_search_by` said `"If there are multiple matches, then any one of the matches could be returned."`, but the implementation isn't the same thing. Actually, it returns the **last one** if multiple matches found.

Then we got two options:

## If returns the last one is the correct or desired result

Then I can rectify the docs and revert my changes.

## If the docs are correct or desired result

Then my changes can be merged after fully reviewed.

However, if my PR gets merged, another issue raised: this could be a **breaking change** since if multiple matches found, the returning order no longer the last one instead of it could be any one.

For example:
```rust
let mut s = vec![0, 1, 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
let num = 1;
let idx = s.binary_search(&num);
s.insert(idx, 2);

// Old implementations
assert_eq!(s, [0, 1, 1, 1, 1, 2, 2, 3, 5, 8, 13, 21, 34, 42, 55]);

// New implementations
assert_eq!(s, [0, 1, 1, 1, 2, 1, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
```

# Benchmarking

**Old implementations**
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          59 ns/iter (+/- 4)
test slice::binary_search_l1_with_dups ... bench:          59 ns/iter (+/- 3)
test slice::binary_search_l2           ... bench:          76 ns/iter (+/- 5)
test slice::binary_search_l2_with_dups ... bench:          77 ns/iter (+/- 17)
test slice::binary_search_l3           ... bench:         183 ns/iter (+/- 23)
test slice::binary_search_l3_with_dups ... bench:         185 ns/iter (+/- 19)
```

**New implementations (1)**

Implemented by this PR.
```rust
if cmp == Equal {
    return Ok(mid);
} else if cmp == Less {
    base = mid
}
```
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          58 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench:          37 ns/iter (+/- 4)
test slice::binary_search_l2           ... bench:          76 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench:          57 ns/iter (+/- 6)
test slice::binary_search_l3           ... bench:         200 ns/iter (+/- 30)
test slice::binary_search_l3_with_dups ... bench:         157 ns/iter (+/- 6)

$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          59 ns/iter (+/- 8)
test slice::binary_search_l1_with_dups ... bench:          37 ns/iter (+/- 2)
test slice::binary_search_l2           ... bench:          77 ns/iter (+/- 2)
test slice::binary_search_l2_with_dups ... bench:          57 ns/iter (+/- 2)
test slice::binary_search_l3           ... bench:         198 ns/iter (+/- 21)
test slice::binary_search_l3_with_dups ... bench:         158 ns/iter (+/- 11)

```

**New implementations (2)**

Suggested by `@nbdd0121` in [comment](https://github.com/rust-lang/rust/pull/74024#issuecomment-665430239).
```rust
base = if cmp == Greater { base } else { mid };
if cmp == Equal { break }
```

```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          59 ns/iter (+/- 7)
test slice::binary_search_l1_with_dups ... bench:          37 ns/iter (+/- 5)
test slice::binary_search_l2           ... bench:          75 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench:          56 ns/iter (+/- 3)
test slice::binary_search_l3           ... bench:         195 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench:         151 ns/iter (+/- 7)

$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          57 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench:          38 ns/iter (+/- 2)
test slice::binary_search_l2           ... bench:          77 ns/iter (+/- 11)
test slice::binary_search_l2_with_dups ... bench:          57 ns/iter (+/- 4)
test slice::binary_search_l3           ... bench:         194 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench:         151 ns/iter (+/- 18)

```

I run some benchmarking testings against on two implementations. The new implementation has a lot of improvement in duplicates cases, while in `binary_search_l3` case, it's a little bit slower than the old one.
2021-03-05 20:12:13 +00:00
..
ascii mv std libs to library/ 2020-07-27 19:51:13 -05:00
char Slight perf improvement on char::to_ascii_lowercase 2021-02-06 19:56:43 +00:00
hash mv std libs to library/ 2020-07-27 19:51:13 -05:00
num flt2dec: properly handle uninitialized memory 2020-09-02 12:41:38 +02:00
any.rs mv std libs to library/ 2020-07-27 19:51:13 -05:00
ascii.rs Unify way to flip 6th bit. (Same assembly generated) 2021-02-08 12:21:36 +00:00
fmt.rs Use more efficient scheme for display u128/i128 2020-09-28 20:38:38 +00:00
iter.rs Add more benchmarks 2021-01-08 09:50:35 +00:00
lib.rs mv std libs to library/ 2020-07-27 19:51:13 -05:00
ops.rs mv std libs to library/ 2020-07-27 19:51:13 -05:00
pattern.rs mv std libs to library/ 2020-07-27 19:51:13 -05:00
slice.rs Improve slice.binary_search_by()'s best-case performance to O(1) 2021-01-30 14:11:47 +08:00