rust/tests/rustdoc-js
Michael Howell 86da4be47f rustdoc: use a trie for name-based search
Preview and profiler results
----------------------------

Here's some quick profiling, done in Firefox on the Rust compiler docs:

- Before: https://share.firefox.dev/3UPm3M8
- After: https://share.firefox.dev/40LXvYb

Here are the results from the Node.js profiler:

- https://notriddle.com/rustdoc-html-demo-15/trie-perf/index.html

Here's a copy that you can use to try it out. Compare it with [the nightly].
Try typing `typecheckercontext` one character at a time, slowly.

- https://notriddle.com/rustdoc-html-demo-15/compiler-doc-trie/index.html

[the nightly]: https://doc.rust-lang.org/nightly/nightly-rustc/

The fuzzy match algorithm is based on [Fast String Correction with
Levenshtein-Automata] and the corresponding implementation code in [moman]
and [Lucene]. The bit-packing representation comes from Lucene, but the
actual matcher is based more closely on `fsc.py`. As suggested in the paper,
a trie is used to represent the finite-state automaton (FSA) dictionary.
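
As a rough illustration of fuzzy lookup over a trie, here is a minimal sketch.
It does not build the Levenshtein automaton described in the paper. Instead it
swaps in a simpler technique: carrying one row of the edit-distance table down
each trie edge and pruning branches that can no longer match. `TrieNode`,
`trieInsert`, and `fuzzySearch` are hypothetical names, not taken from rustdoc.

```javascript
// Sketch only: hypothetical names, not rustdoc's actual search code.
class TrieNode {
    constructor() {
        this.children = new Map(); // one edge per UTF-16 code unit
        this.terminal = false;     // true if a stored name ends here
    }
}

function trieInsert(root, name) {
    let node = root;
    for (let i = 0; i < name.length; i++) {
        const ch = name[i];
        if (!node.children.has(ch)) {
            node.children.set(ch, new TrieNode());
        }
        node = node.children.get(ch);
    }
    node.terminal = true;
}

// Collect every stored name within `maxDist` plain Levenshtein edits of `query`.
function fuzzySearch(root, query, maxDist) {
    const results = [];
    const firstRow = [];
    for (let i = 0; i <= query.length; i++) {
        firstRow.push(i); // distance from the empty prefix to query[0..i]
    }
    function walk(node, prefix, prevRow) {
        if (node.terminal && prevRow[query.length] <= maxDist) {
            results.push({ name: prefix, dist: prevRow[query.length] });
        }
        for (const [ch, child] of node.children) {
            const row = [prevRow[0] + 1];
            for (let i = 1; i <= query.length; i++) {
                const cost = query[i - 1] === ch ? 0 : 1;
                row.push(Math.min(
                    row[i - 1] + 1,        // skip a query character
                    prevRow[i] + 1,        // skip a name character
                    prevRow[i - 1] + cost, // match or substitute
                ));
            }
            // Prune: no extension of this prefix can get back under the limit.
            if (Math.min(...row) <= maxDist) {
                walk(child, prefix + ch, row);
            }
        }
    }
    walk(root, "", firstRow);
    return results;
}
```

Because names that share a prefix share trie nodes, the table rows for that
prefix are computed once for all of them, and a whole subtree is skipped as
soon as every cell in its row exceeds the limit.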

The same trie is used for prefix matching. Substring matching is done with a
side table of three-character[^1] windows that point into the trie.
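
The side-table idea can be sketched roughly as follows. This is an
illustration under simplifying assumptions, not the actual implementation: the
table here maps each three-code-unit window straight to the names that contain
it, where the real table points into the trie. `buildWindowTable` and
`substringSearch` are hypothetical names.

```javascript
// Sketch only: hypothetical names, not rustdoc's actual search code.
const WINDOW = 3; // window length, in UTF-16 code units

// Map every three-unit window to the set of names that contain it.
function buildWindowTable(names) {
    const table = new Map();
    for (const name of names) {
        for (let i = 0; i + WINDOW <= name.length; i++) {
            const key = name.slice(i, i + WINDOW);
            let bucket = table.get(key);
            if (!bucket) {
                bucket = new Set();
                table.set(key, bucket);
            }
            bucket.add(name);
        }
    }
    return table;
}

// Use the query's first window to find candidates, then confirm each one.
function substringSearch(table, query) {
    if (query.length < WINDOW) {
        return []; // too short to form a window, so no substring matches
    }
    const bucket = table.get(query.slice(0, WINDOW));
    if (!bucket) {
        return [];
    }
    return [...bucket].filter((name) => name.includes(query));
}
```

A query shorter than one window has no key to look up, which is why this
sketch returns nothing for it.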

[Fast String Correction with Levenshtein-Automata]: https://github.com/tpn/pdfs/blob/master/Fast%20String%20Correction%20with%20Levenshtein-Automata%20(2002)%20(10.1.1.16.652).pdf
[Lucene]: https://fossies.org/linux/lucene/lucene/core/src/java/org/apache/lucene/util/automaton/Lev1TParametricDescription.java
[moman]: https://gitlab.com/notriddle/moman-rustdoc

User-visible changes
--------------------

I don't expect anybody to notice anything, but it does cause two changes:

- Substring matches, in the middle of a name, only apply if there are three
  or more characters in the search query.
- The Levenshtein distance limit now maxes out at two. In the old version,
  the limit was w/3 (w being the query length), so you could get looser
  matches for queries with 9 or more characters[^1] in them.
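
To make the second change concrete, here is a small sketch of the two limits.
It assumes that w/3 is rounded down and that the new limit is simply the old
one capped at two; both are assumptions of this example, and the function
names are hypothetical.

```javascript
// Sketch only: hypothetical names; the rounding and the cap are assumptions.

// Old rule: one allowed edit per three code units of the query.
function oldEditLimit(queryLength) {
    return Math.floor(queryLength / 3);
}

// New rule (assumed): the same ratio, but never more than two edits.
function newEditLimit(queryLength) {
    return Math.min(2, Math.floor(queryLength / 3));
}
```

Under these assumptions, an 18-unit query such as `typecheckercontext` would
have allowed six edits before and allows two now, while queries shorter than
nine units are unaffected.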

[^1]: Technically, UTF-16 code units.
2024-11-13 12:04:46 -07:00
auxiliary rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
assoc-type-backtrack.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
assoc-type-backtrack.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
assoc-type-loop.js rustdoc-search: avoid infinite where clause unbox 2023-11-24 10:42:11 -07:00
assoc-type-loop.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
assoc-type-unbound.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
assoc-type-unbound.rs rustdoc-search: show types signatures in results 2024-10-30 10:35:39 -07:00
assoc-type.js rustdoc-search: show types signatures in results 2024-10-30 10:35:39 -07:00
assoc-type.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
basic.js Add test for PR #126057 2024-06-07 05:49:46 +08:00
basic.rs
big-result.js rustdoc-search: use set ops for ranking and filtering 2023-12-13 10:37:15 -07:00
big-result.rs rustdoc-search: use set ops for ranking and filtering 2023-12-13 10:37:15 -07:00
doc-alias-filter-out.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
doc-alias-filter-out.rs
doc-alias-filter.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
doc-alias-filter.rs
doc-alias-whitespace.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
doc-alias-whitespace.rs
doc-alias.js Add test for PR #126057 2024-06-07 05:49:46 +08:00
doc-alias.rs Add test for PR #126057 2024-06-07 05:49:46 +08:00
enum-variant-not-type.js rustdoc-search: do not treat associated type names as types 2023-12-10 16:52:21 -07:00
enum-variant-not-type.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
exact-match.js rustdoc-search: count path edits with separate edit limit 2023-12-26 18:46:17 -07:00
exact-match.rs
extern-func.js allow type-based search on foreign functions 2024-10-25 12:19:04 -05:00
extern-func.rs allow type-based search on foreign functions 2024-10-25 12:19:04 -05:00
foreign-type-path.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
foreign-type-path.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
full-path-function.js rustdoc-search: use set ops for ranking and filtering 2023-12-13 10:37:15 -07:00
full-path-function.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
gat.js rustdoc-search: add support for associated types 2023-11-19 18:54:36 -07:00
gat.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
generics2.js rustdoc-search: fix accidental shared, mutable map 2023-11-17 18:22:31 -07:00
generics2.rs rustdoc-search: fix accidental shared, mutable map 2023-11-17 18:22:31 -07:00
generics-impl.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-impl.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-match-ambiguity-no-unbox.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-match-ambiguity-no-unbox.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-match-ambiguity.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-match-ambiguity.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-multi-trait.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
generics-multi-trait.rs
generics-nested.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-nested.rs rustdoc-search: add support for nested generics 2023-04-14 14:55:45 -07:00
generics-trait.js rustdoc-search: show types signatures in results 2024-10-30 10:35:39 -07:00
generics-trait.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
generics-unbox.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics-unbox.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
generics.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
hof.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
hof.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
impl-trait.js Adjust ranking so that duplicates count against rank 2024-10-31 13:12:14 -07:00
impl-trait.rs rustdoc-search: fix bug with multi-item impl trait 2023-10-05 22:32:37 -07:00
looks-like-rustc-interner.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
looks-like-rustc-interner.rs rustdoc-search: stress test for associated types 2024-03-11 09:20:49 -07:00
macro-search.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
macro-search.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
module-substring.js rustdoc-search: count path edits with separate edit limit 2023-12-26 18:46:17 -07:00
module-substring.rs
nested-unboxed.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
nested-unboxed.rs rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
never-search.js rustdoc-search: allow trailing Foo -> arg search 2024-09-05 17:58:05 -07:00
never-search.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
non-english-identifier.js rustdoc: use a trie for name-based search 2024-11-13 12:04:46 -07:00
non-english-identifier.rs Add test for PR #126057 2024-06-07 05:49:46 +08:00
path-maxeditdistance.js rustdoc: use a trie for name-based search 2024-11-13 12:04:46 -07:00
path-maxeditdistance.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
path-ordering.js rustdoc-search: count path edits with separate edit limit 2023-12-26 18:46:17 -07:00
path-ordering.rs rustdoc-search: count path edits with separate edit limit 2023-12-26 18:46:17 -07:00
primitive.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
primitive.rs
prototype.js rustdoc: use a trie for name-based search 2024-11-13 12:04:46 -07:00
prototype.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
raw-pointer.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
raw-pointer.rs
reexport-dedup-macro.js rustdoc-search: single result for items with multiple paths 2024-04-08 17:07:14 -07:00
reexport-dedup-macro.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
reexport-dedup-method.js rustdoc-search: single result for items with multiple paths 2024-04-08 17:07:14 -07:00
reexport-dedup-method.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
reexport-dedup.js rustdoc-search: single result for items with multiple paths 2024-04-08 17:07:14 -07:00
reexport-dedup.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
reexport.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
reexport.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
reference.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
reference.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
search-bag-semantics.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
search-bag-semantics.rs rustdoc: implement bag semantics for function parameter search 2023-03-19 18:19:24 -07:00
search-method-disambiguate.js rustdoc-search: add impl disambiguator to duplicate assoc items 2023-09-21 15:16:44 -07:00
search-method-disambiguate.rs rustdoc-search: add impl disambiguator to duplicate assoc items 2023-09-21 15:16:44 -07:00
search-non-local-trait-impl.js Add regression test for #115480 2023-10-11 11:41:39 +02:00
search-non-local-trait-impl.rs [AUTO_GENERATED] Migrate compiletest to use ui_test-style //@ directives 2024-02-22 16:04:04 +00:00
search-short-types.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
search-short-types.rs
self-is-not-generic.js Add test for Self not being a generic in search index 2024-08-04 12:49:28 -07:00
self-is-not-generic.rs Add test for Self not being a generic in search index 2024-08-04 12:49:28 -07:00
slice-array.js rustdoc: add note about slice/array searches to help popup 2023-06-10 14:08:26 -07:00
slice-array.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
struct-like-variant.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
struct-like-variant.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
substring.js rustdoc-search: remove parallel searchWords array 2023-12-15 16:26:35 -07:00
substring.rs rustdoc-search: remove parallel searchWords array 2023-12-15 16:26:35 -07:00
summaries.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
summaries.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
trait-methods.js rustdoc-search: add support for associated types 2023-11-19 18:54:36 -07:00
trait-methods.rs rustdoc-search: add support for associated types 2023-11-19 18:54:36 -07:00
tuple-unit.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
tuple-unit.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
type-parameters.js Adjust ranking so that duplicates count against rank 2024-10-31 13:12:14 -07:00
type-parameters.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00
underscoredtype.js rustdoc-search: use lowercase, non-normalized name for type search 2024-06-09 11:56:52 -07:00
underscoredtype.rs rustdoc-search: use lowercase, non-normalized name for type search 2024-06-09 11:56:52 -07:00
where-clause.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
where-clause.rs rustfmt tests/rustdoc-js/. 2024-06-04 14:15:06 +10:00