[librustdoc] Only split lang string on `,`, ` `, and `\t`
Split markdown lang strings into tokens on `,`.
The previous behavior was to split lang strings into tokens on any
character that wasn't a `_`, `_`, or alphanumeric.
This is a potentially breaking change, so please scrutinize! See discussion in #78344.
I noticed some test cases that made me wonder if there might have been some reason for the original behavior:
```
t("{.no_run .example}", false, true, Ignore::None, true, false, false, false, v(), None);
t("{.sh .should_panic}", true, false, Ignore::None, false, false, false, false, v(), None);
t("{.example .rust}", false, false, Ignore::None, true, false, false, false, v(), None);
t("{.test_harness .rust}", false, false, Ignore::None, true, true, false, false, v(), None);
```
It seemed pretty peculiar to specifically test lang strings in braces, with all the tokens prefixed by `.`.
I did some digging, and it looks like the test cases were added way back in [this commit from 2014](https://github.com/rust-lang/rust/commit/3fef7a74ca9a) by `@skade.`
It looks like they were added just to make sure that the splitting was permissive, and aren't testing that those strings in particular are accepted.
Closes https://github.com/rust-lang/rust/issues/78344.
Make char and u8 methods const
char methods `len_utf8`, `len_utf16`, `to_ascii_lowercase`, `eq_ignore_ascii_case` can be made const.
`u8` methods `to_ascii_lowercase`, `to_ascii_uppercase` are required to be const as well.
`u8::eq_ignore_ascii_case` was additionally made const.
Rebase of https://github.com/rust-lang/rust/pull/79549 originally authored by ``@YenForYang.`` Changes from that PR:
- Squashed all commits from #79549.
- rebased to latest upstream master.
- Removed const attributes for `char::escape_unicode` and `char::escape_default`.
- Updated `since` attributes for `const` stabilization to 1.52.0.
cc ``@m-ou-se.``
Make ptr::write const
~~The code in this PR as of right now is not much more than an experiment.~~
~~This should, if I am not mistaken, in theory compile and pass the tests once the bootstraping compiler is updated. Thus the PR is blocked on that which should happen some time after the February the 9th. Also we might want to wait for #79989 to avoid regressing performance due to using `mem::forget` over `intrinsics::forget`~~.
Convert core/num/mod.rs to intra-doc links
Helps with #75080.
This can't convert the associated constants `MAX` and `MIN` until #74489 is merged.
r? `@poliorcetics`
Expand FlattenCompat folds
The former `chain`+`chain`+`fold` implementation looked nice from a
functional-programming perspective, but it introduced unnecessary layers
of abstraction on every `flat_map`/`flatten` fold. It's straightforward
to just fold each part in turn, and this makes it look like a simplified
version of the existing `try_fold` implementation.
For the `iter::bench_flat_map*` benchmarks, I get a large improvement in
`bench_flat_map_chain_sum`, from 1,598,473 ns/iter to 499,889 ns/iter,
and the rest are unchanged.
`escape_unicode`, `escape_default`, `len_utf8`, `len_utf16`, to_ascii_lowercase`, `eq_ignore_ascii_case`
`u8` methods `to_ascii_lowercase`, `to_ascii_uppercase` also must be made const
u8 methods made const
Update methods.rs
Update mod.rs
Update methods.rs
Fix `since` in rustc_const_stable to next stable
Fix `since` in rustc_const_stable to next stable
Update methods.rs
Update mod.rs
Update the bootstrap compiler
This updates the bootstrap compiler, notably leaving out a change to enable semicolon in macro expressions lint, because stdarch still depends on the old behavior.
Slight perf improvement on char::to_ascii_lowercase
`char::to_ascii_lowercase()` was checking if it was ascii and then if it was in the right range. Instead propose to check once (I think removing a compare and a shift in the process: [godbolt](https://godbolt.org/z/e5Tora) ).
before:
```
test char::methods::bench_to_ascii_lowercase ... bench: 11,196 ns/iter (+/- 632)
test char::methods::bench_to_ascii_uppercase ... bench: 11,656 ns/iter (+/- 671)
```
after:
```
test char::methods::bench_to_ascii_lowercase ... bench: 9,612 ns/iter (+/- 979)
test char::methods::bench_to_ascii_uppercase ... bench: 8,241 ns/iter (+/- 701)
```
(calling u8::to_ascii_lowercase and letting that flip the 5th bit is also an option, but it's more instructions. I'm thinking for things around ascii and char we want to be as efficient as possible.)
Improve design of `assert_len`
It was discussed in the [tracking issue](https://github.com/rust-lang/rust/issues/76393#issuecomment-761765448) that `assert_len`'s name and usage are confusing. This PR improves them based on a suggestion by ``@scottmcm`` in that issue.
I also improved the documentation to make it clearer when you might want to use this method.
Old example:
```rust
let range = range.assert_len(slice.len());
```
New example:
```rust
let range = range.ensure_subset_of(..slice.len());
```
Fixes#81157
improve UnsafeCell docs
Sometimes [questions like this come up](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/UnsafeCells.20as.20raw.20pointers) because the UnsafeCell docs say "it's the only legal way to obtain aliasable data that is considered mutable". That is not entirely correct, since raw pointers also provide that option. So I propose we focus the docs on the interaction of `UnsafeCell` and *shared references* specifically, which is really where they are needed.
Add internal `collect_into_array[_unchecked]` to remove duplicate code
Unlike the similar PRs #69985, #75644 and #79659, this PR only adds private functions and does not propose any new public API. The change is just for the purpose of avoiding duplicate code.
Many array methods already contained the same kind of code and there are still many array related methods to come (e.g. `Iterator::{chunks, map_windows, next_n, ...}`, `[T; N]::{cloned, copied, ...}`, ...) which all basically need this functionality. Writing custom `unsafe` code for each of those doesn't seem like a good idea. I added two functions in this PR (and not just the `unsafe` version) because I already know that I need the `Option`-returning version for `Iterator::map_windows`.
This is closely related to https://github.com/rust-lang/rust/issues/81615. I think that all options listed in that issue can be implemented using the function added in this PR. The only instance where `collect_array_into` might not be general enough is when the caller want to handle incomplete arrays manually. Currently, if `iter` yields fewer than `N` items, `None` is returned and the already yielded items are dropped. But as this is just a private function, it can be made more general in future PRs.
And while this was not the goal, this seems to lead to better assembly for `array::map`: https://rust.godbolt.org/z/75qKTa (CC ``@JulianKnodt)``
Let me know what you think :)
CC ``@matklad`` ``@bstrie``
Improve assert_eq! and assert_ne!
This PR improves `assert_eq!` and `assert_ne!` by moving the panicking code in an external function.
It does not change the fast path, but the move of the formatting in the cold path (the panic) may have a positive effect on in instruction cache use and with inlining.
Moreover, the use of trait objects instead of generic may improve compile times for `assert_eq!`-heavy code.
Godbolt link: ~~https://rust.godbolt.org/z/TYa9MT~~ \
Updated: https://rust.godbolt.org/z/bzE84x
simplify eat_digits
Simplify eat_digits by checking values in iterator, plus decrease function size, by returning unchecked slices.
https://godbolt.org/z/cxjav4
Add tests for Atomic*::fetch_{min,max}
This ensures that all atomic operations except for fences are tested. This has been useful to test my work on using atomic instructions for atomic operations in cg_clif instead of a global lock.
Ensure valid TraitRefs are created for GATs
This fixes `ProjectionTy::trait_ref` to use the correct substs. Places that need all of the substs have been updated to not use `trait_ref`.
r? ````@jackh726````
Implement RFC 2580: Pointer metadata & VTable
RFC: https://github.com/rust-lang/rfcs/pull/2580
~~Before merging this PR:~~
* [x] Wait for the end of the RFC’s [FCP to merge](https://github.com/rust-lang/rfcs/pull/2580#issuecomment-759145278).
* [x] Open a tracking issue: https://github.com/rust-lang/rust/issues/81513
* [x] Update `#[unstable]` attributes in the PR with the tracking issue number
----
This PR extends the language with a new lang item for the `Pointee` trait which is special-cased in trait resolution to implement it for all types. Even in generic contexts, parameters can be assumed to implement it without a corresponding bound.
For this I mostly imitated what the compiler was already doing for the `DiscriminantKind` trait. I’m very unfamiliar with compiler internals, so careful review is appreciated.
This PR also extends the standard library with new unstable APIs in `core::ptr` and `std::ptr`:
```rust
pub trait Pointee {
/// One of `()`, `usize`, or `DynMetadata<dyn SomeTrait>`
type Metadata: Copy + Send + Sync + Ord + Hash + Unpin;
}
pub trait Thin = Pointee<Metadata = ()>;
pub const fn metadata<T: ?Sized>(ptr: *const T) -> <T as Pointee>::Metadata {}
pub const fn from_raw_parts<T: ?Sized>(*const (), <T as Pointee>::Metadata) -> *const T {}
pub const fn from_raw_parts_mut<T: ?Sized>(*mut (),<T as Pointee>::Metadata) -> *mut T {}
impl<T: ?Sized> NonNull<T> {
pub const fn from_raw_parts(NonNull<()>, <T as Pointee>::Metadata) -> NonNull<T> {}
/// Convenience for `(ptr.cast(), metadata(ptr))`
pub const fn to_raw_parts(self) -> (NonNull<()>, <T as Pointee>::Metadata) {}
}
impl<T: ?Sized> *const T {
pub const fn to_raw_parts(self) -> (*const (), <T as Pointee>::Metadata) {}
}
impl<T: ?Sized> *mut T {
pub const fn to_raw_parts(self) -> (*mut (), <T as Pointee>::Metadata) {}
}
/// `<dyn SomeTrait as Pointee>::Metadata == DynMetadata<dyn SomeTrait>`
pub struct DynMetadata<Dyn: ?Sized> {
// Private pointer to vtable
}
impl<Dyn: ?Sized> DynMetadata<Dyn> {
pub fn size_of(self) -> usize {}
pub fn align_of(self) -> usize {}
pub fn layout(self) -> crate::alloc::Layout {}
}
unsafe impl<Dyn: ?Sized> Send for DynMetadata<Dyn> {}
unsafe impl<Dyn: ?Sized> Sync for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Debug for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Unpin for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Copy for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Clone for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Eq for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> PartialEq for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Ord for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> PartialOrd for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Hash for DynMetadata<Dyn> {}
```
API differences from the RFC, in areas noted as unresolved questions in the RFC:
* Module-level functions instead of associated `from_raw_parts` functions on `*const T` and `*mut T`, following the precedent of `null`, `slice_from_raw_parts`, etc.
* Added `to_raw_parts`
Add a `Result::into_ok_or_err` method to extract a `T` from `Result<T, T>`
When updating code to handle the semi-recent deprecation of `compare_and_swap` in favor of `compare_exchange`, which returns `Result<T, T>`, I wanted this. I've also wanted it with code using `slice::binary_search` before.
The name (and perhaps the documentation) is the hardest part here, but this name seems consistent with the other Result methods, and equivalently memorable.
Document that `assert!` format arguments are evaluated lazily
It can be useful to do some computation in `assert!` format arguments, in order to get better error messages. For example:
```rust
assert!(
some_condition,
"The state is invalid. Details: {}",
expensive_call_to_get_debugging_info(),
);
```
It seems like `assert!` only evaluates the format arguments if the assertion fails, which is useful but doesn't appear to be documented anywhere. This PR documents the behavior and adds some tests.
To digit simplification
I found out the other day that all the ascii digits have the first four bits as one would hope them to. (Eg. char `2` ends `0b0010`). There are two bits to indicate it's in the digit range ( `0b0011_0000`). If it is a true digit then all the higher bits aside from these two will be 0 (as ascii is the lowest part of the unicode u32 spectrum). So XORing with `0b11_0000` should mean we either get the number 0-9 or alternativly we get a larger number in the u32 space. If we get something that's not 0-9 then it will be discarded as it will be greater than the radix.
The code seems so fast though that there's quite a lot of noise in the benchmarks so it's not that easy to prove conclusively that it's faster as well as less instructions.
The non-fast path I was toying with as well wondering if we could do this as then we'd only have one return and less instructions still:
```
match self {
'a'..='z' => self as u32 - 'a' as u32 + 10,
'A'..='Z' => self as u32 - 'A' as u32 + 10,
_ => { radix = 10; self as u32 ^ ASCII_DIGIT_MASK},
}
```
Here's the [godbolt](https://godbolt.org/z/883c9n).
( H/T to ``@byteshadow`` for pointing out xor was what I needed)
It can be useful to do some computation in `assert!` format arguments, in order to get better error messages. For example:
```rust
assert!(
some_condition,
"The state is invalid. Details: {}",
expensive_call_to_get_debugging_info(),
);
```
It seems like `assert!` only evaluates the format arguments if the assertion fails, which is useful but doesn't appear to be documented anywhere. This PR documents the behavior and adds some tests.
The former `chain`+`chain`+`fold` implementation looked nice from a
functional-programming perspective, but it introduced unnecessary layers
of abstraction on every `flat_map`/`flatten` fold. It's straightforward
to just fold each part in turn, and this makes it look like a simplified
version of the existing `try_fold` implementation.
For the `iter::bench_flat_map*` benchmarks, I get a large improvement in
`bench_flat_map_chain_sum`, from 1,598,473 ns/iter to 499,889 ns/iter,
and the rest are unchanged.