rustdoc-search: add impl disambiguator to duplicate assoc items
Preview (to see the difference, click the link and pay attention to the specific function that comes up):
| Before | After |
|--|--|
| [`simd<i64>, simd<i64> -> simd<i64>`](https://doc.rust-lang.org/nightly/std/?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | [`simd<i64>, simd<i64> -> simd<i64>`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) |
| [`cow, vec -> bool`](https://doc.rust-lang.org/nightly/std/?search=cow%2C%20vec%20-%3E%20bool) | [`cow, vec -> bool`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=cow%2C%20vec%20-%3E%20bool)
Helps with #90929
This changes the search results, specifically, when there's more than one impl with an associated item with the same name. For example, the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>` don't link to the same function, but most of the functions have the same names.
This change should probably be FCP-ed, especially since it adds a new anchor link format for `main.js` to handle, so that URLs like `struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there are a few reasons for it:
* I'd like to avoid making the HTML bigger. Obviously, fixing this bug is going to add at least a little more data to the search index, but adding more HTML penalises viewers for the benefit of searchers.
* Breaking `struct.Vec.html#method.len` would also be a disappointment.
On the other hand:
* The path-style anchors might be less prone to link rot than the numbered anchors. It's definitely less likely to have URLs that appear to "work", but silently point at the wrong thing.
* This commit arranges the path-style anchor to redirect to the numbered anchor. Nothing stops rustdoc from doing the opposite, making path-style anchors the default and redirecting the "legacy" numbered ones.
### The bug
On the "Before" links, this example search calls for `i64`:
![image](https://github.com/rust-lang/rust/assets/1593513/9431d89d-41dc-4f68-bbb1-3e2704a973d2)
But if I click any of the results, I get `f64` instead.
![image](https://github.com/rust-lang/rust/assets/1593513/6d89c692-1847-421a-84d9-22e359d9cf82)
The PR fixes this problem by adding enough information to the search result `href` to disambiguate methods with different types but the same name.
More detailed description of the problem at:
https://github.com/rust-lang/rust/pull/109422#issuecomment-1491089293
> When a struct/enum/union has multiple impls with different type parameters, it can have multiple methods that have the same name, but which are on different impls. Besides Simd, [Any](https://doc.rust-lang.org/nightly/std/any/trait.Any.html?search=any%3A%3Adowncast) also demonstrates this pattern. It has three methods named `downcast`, on three different impls.
>
> When that happens, it presents a challenge in linking to the method. Normally we link like `#method.foo`. When there are multiple `foo`, we number them like `#method.foo`, `#method.foo-1`, `#method.foo-2`, etc.
>
> It also presents a challenge for our search code. Currently we store all the variants in the index, but don’t have any way to generate unambiguous URLs in the results page, or to distinguish them in the SERP.
>
> To fix this, we need three things:
>
> 1. A fragment format that fully specifies the impl type parameters when needed to disambiguate (`#impl-SimdOrd-for-Simd<i64,+LANES>/method.simd_max`)
> 2. A search index that stores methods with enough information to disambiguate the impl they were on.
> 3. A search results interface that can display multiple methods on the same type with the same name, when appropriate OR a disambiguation landing section on item pages?
>
> For reviewers: it can be hard to see the new fragment format in action since it immediately gets rewritten to the numbered form.
[rustdoc] Show enum discrimant if it is a C-like variant
Fixes https://github.com/rust-lang/rust/issues/101337.
We currently display values for associated constant items in traits:
![image](https://github.com/rust-lang/rust/assets/3050060/03e566ec-c670-47b4-8ca2-b982baa7a0f4)
And we also display constant values like [here](file:///home/imperio/rust/rust/build/x86_64-unknown-linux-gnu/doc/std/f32/consts/constant.E.html).
I think that for coherency, we should display values of C-like enum variants.
With this change, it looks like this:
![image](https://github.com/rust-lang/rust/assets/3050060/b53fbbe0-bdb1-4289-8537-f2dd4988e9ac)
As for the display of the constant value itself, I used what we already have to keep coherency.
We display the C-like variants value in the following scenario:
1. It is a C-like variant with a value set => all the time
2. It is a C-like variant without a value set: All other variants are C-like variants and at least one them has its value set.
Here is the result in code:
```rust
// Ax and Bx value will be displayed.
enum A {
Ax = 12,
Bx,
}
// Ax and Bx value will not be displayed
enum B {
Ax,
Bx,
}
// Bx value will not be displayed
enum C {
Ax(u32),
Bx,
}
// Bx value will not be displayed, Cx value will be displayed.
#[repr(u32)]
enum D {
Ax(u32),
Bx,
Cx = 12,
}
```
r? `@notriddle`
This should result in a layout for the actual standard library,
when built on CI, that looks like this:
_____
/ \ std
| R | 1.74.0-nightly
\_____/
(203c57dbe 2023-09-17)
Having the whole version as one string caused it to flex wrap,
because the sidebar isn't wide enough to fit the whole thing.
This commit changes the layout to something a bit less "look at my logo!!!111"
gigantic, and makes it clearer where clicking the logo will actually take you.
It also means the crate name is persistently at the top of the sidebar, even
when in a sub-item page, and clicking that name takes you back to the root.
| | Short crate name | Long crate name |
|---------|------------------|-----------------|
| Root | ![short-root] | ![long-root]
| Subpage | ![short-subpage] | ![long-subpage]
[short-root]: https://github.com/rust-lang/rust/assets/1593513/fe2ce102-d4b8-44e6-9f7b-68636a907f56
[short-subpage]: https://github.com/rust-lang/rust/assets/1593513/29501663-56c0-4151-b7de-d2637e167125
[long-root]: https://github.com/rust-lang/rust/assets/1593513/f6a385c0-b4c5-4a9c-954b-21b38de4192f
[long-subpage]: https://github.com/rust-lang/rust/assets/1593513/97ec47b4-61bf-4ebe-b461-0d2187b8c6cahttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/image/index.htmlhttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/crossbeam_channel/index.htmlhttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/adler/struct.Adler32.htmlhttps://notriddle.com/rustdoc-html-demo-4/logo-lockup/crossbeam_channel/struct.Sender.html
This improves visual information density (the construct with the logo and
crate name is *shorter* than the logo on its own, because it's not
square) and navigation clarity (we can now see what clicking the Rust logo
does, specifically).
Compare this with the layout at [Phoenix's Hexdocs] (which is what this
proposal is closely based on), the old proposal on [Internals Discourse]
(which always says "Rust standard library" in the sidebar, but doesn't do the
side-by-side layout).
[Phoenix's Hexdocs]: https://hexdocs.pm/phoenix/1.7.7/overview.html
[Internals Discourse]: https://internals.rust-lang.org/t/poc-of-a-new-design-for-the-generated-rustdoc/11018
In newer versions of rustdoc, the crate name and version are always shown in
the sidebar, even in subpages. Clicking the crate name does the same thing
clicking the logo always did: return you to the crate root.
While this actually takes up less screen real estate than the old layout on
desktop, it takes up more HTML. It's also a bit more visually complex.
I could do what the Internals POC did and keep the vertically stacked layout
all the time, instead of doing a horizontal stack where possible. It would
take up more screen real estate, though.
This design is lifted almost verbatim from Hexdocs. It seems to work for them.
[`opentelemetry_process_propagator`], for example, has a long application name.
[`opentelemetry_process_propagator`]: https://hexdocs.pm/opentelemetry_process_propagator/OpentelemetryProcessPropagator.html
Has anyone written the rationale on why the Rust logo shows up on projects that
aren't the standard library? If we turned it off on non-standard crates by
default, it would line wrap crate names a lot less often.
Or maybe we should encourage crate authors to include their own logo more
often? It certainly helps give people a better sense of "place."
I'm not sure of anything that directly follows up this one. Plenty of other
changes could be made to improve the layout, like
* coming up with a less cluttered way to do disclosure (there's a lot of `[-]`
on the page)
* doing a better job of separating lateral navigation (vec::Vec links to
vec::IntoIter) and the table of contents (vec::Vec links to vec::Vec::new)
* giving readers more control of how much rustdoc hows them, and giving doc
authors more control of how much it generates
* better search that reduces the need to browse
But those are mostly orthogonal, not future possibilities unlocked by this change.
rustdoc: fix & clean up handling of cross-crate higher-ranked parameters
Preparatory work for the refactoring planned in #113015 (for correctness & maintainability).
---
1. Render the higher-ranked parameters of cross-crate function pointer types **(*)**.
2. Replace occurrences of `collect_referenced_late_bound_regions()` (CRLBR) with `bound_vars()`.
The former is quite problematic and the use of the latter allows us to yank a lot of hacky code **(†)**
as you can tell from the diff! :)
3. Add support for cross-crate higher-ranked types (`#![feature(non_lifetime_binders)]`).
We were previously ICE'ing on them (see `inline_cross/non_lifetime_binders.rs`).
---
**(*)**: Extracted from test `inline_cross/fn-type.rs`:
```diff
- fn(_: &'z fn(_: &'b str), _: &'a ()) -> &'a ()
+ for<'z, 'a, '_unused> fn(_: &'z for<'b> fn(_: &'b str), _: &'a ()) -> &'a ()
```
**(†)**: It returns an `FxHashSet` which isn't *predictable* or *stable* wrt. source code (`.rmeta`) changes. To elaborate, the ordering of late-bound regions doesn't necessarily reflect the ordering found in the source code. It does seem to be stable across compilations but modifying the source code of the to-be-documented crates (like adding or renaming items) may result in a different order:
<details><summary>Example</summary>
Let's assume that we're documenting the cross-crate re-export of `produce` from the code below. On `master`, rustdoc would render the list of binders as `for<'x, 'y, 'z>`. However, once you add back the functions `a`–`l`, it would be rendered as `for<'z, 'y, 'x>` (reverse order)! Results may vary. `bound_vars()` fixes this as it returns them in source order.
```rs
// pub fn a() {}
// pub fn b() {}
// pub fn c() {}
// pub fn d() {}
// pub fn e() {}
// pub fn f() {}
// pub fn g() {}
// pub fn h() {}
// pub fn i() {}
// pub fn j() {}
// pub fn k() {}
// pub fn l() {}
pub fn produce() -> impl for<'x, 'y, 'z> Trait<'z, 'y, 'x> {}
pub trait Trait<'a, 'b, 'c> {}
impl Trait<'_, '_, '_> for () {}
```
</details>
Further, as the name suggests, CRLBR only collects *referenced* regions and thus we drop unused binders. `bound_vars()` contains unused binders on the other hand. Let's stay closer to the source where possible and keep unused binders.
Lastly, using `bound_vars()` allows us to get rid of
* the deduplication and alphabetical sorting hack in `simplify.rs`
* the weird field `bound_params` on `EqPredicate`
both of which were introduced by me in #102707 back when I didn't know better.
To illustrate, let's look at the cross-crate bound `T: for<'a, 'b> Trait<A<'a> = (), B<'b> = ()>`.
* With CRLBR + `EqPredicate.bound_params`, *before* bounds simplification we would have the bounds `T: Trait`, `for<'a> <T as Trait>::A<'a> == ()` and `for<'b> <T as Trait>::B<'b> == ()` which required us to merge `for<>`, `for<'a>` and `for<'b>` into `for<'a, 'b>` in a deterministic manner and without introducing duplicate binders.
* With `bound_vars()`, we now have the bounds `for<'a, b> T: Trait`, `<T as Trait>::A<'a> == ()` and `<T as Trait>::B<'b> == ()` before bound simplification similar to rustc itself. This obviously no longer requires any funny merging of `for<>`s. On top of that `for<'a, 'b>` is guaranteed to be in source order.
This whole thing changes it so that the JS and the UI both use
rustc's own path printing to handle the impl IDs. This results in
the format changing a little bit; full paths are used in spots
where they aren't strictly necessary, and the path sometimes uses
generics where the old system used the trait's own name, but it
shouldn't matter since the orphan rules will prevent it anyway.
Helps with #90929
This changes the search results, specifically, when there's more than
one impl with an associated item with the same name. For example,
the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>`
don't link to the same function, but most of the functions have the
same names.
This change should probably be FCP-ed, especially since it adds a new
anchor link format for `main.js` to handle, so that URLs like
`struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect
to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there
are a few reasons for it:
* I'd like to avoid making the HTML bigger. Obviously, fixing this bug
is going to add at least a little more data to the search index, but
adding more HTML penalises viewers for the benefit of searchers.
* Breaking `struct.Vec.html#method.len` would also be a disappointment.
On the other hand:
* The path-style anchors might be less prone to link rot than the numbered
anchors. It's definitely less likely to have URLs that appear to "work",
but silently point at the wrong thing.
* This commit arranges the path-style anchor to redirect to the numbered
anchor. Nothing stops rustdoc from doing the opposite, making path-style
anchors the default and redirecting the "legacy" numbered ones.
rustdoc: Render private fields in tuple struct as `/* private fields */`
Reopening of https://github.com/rust-lang/rust/pull/110552. All that was missing was a test for the different cases so I added it into the second commit.
Description from the original PR:
> I've gotten some feedback that the current rustdoc rendering of...
>
> ```
> struct HasPrivateFields(_);
> ```
>
> ...is confusing, and I agree with that feedback, especially compared to the field struct case:
>
> ```
> struct HasPrivateFields { /* private fields */ }
> ```
>
> So this PR makes it so that when all of the fields of a tuple variant are private, just render it with the `/* private fields */` comment. We can't *always* render it like that, for example when there's a mix of private and public fields.
cc ````@jsha````
r? ````@notriddle````
Skip rendering metadata strings from include_str!/include_bytes!
The const rendering code in rustdoc completely ignores consts from expansions, but the compiler was rendering all consts. So some consts (namely those from `include_bytes!`) were rendered then ignored.
Most of the diff here is from moving `print_const_expr` from rustdoc into `rustc_hir_pretty` so that it can be used in rustdoc and when building rmeta files.
This let's us handle a multitude of things for free:
- #[doc(hidden)]
- private fields/variants
- --document-private-items
- --document-hidden-items
And correct in the process the determination of "has stripped items" by
doing the same logic done by other ones.