rust/tests/rustdoc-js-std
Michael Howell 86da4be47f rustdoc: use a trie for name-based search
Preview and profiler results
----------------------------

Here's some quick profiling, done in Firefox on the Rust compiler docs:

- Before: https://share.firefox.dev/3UPm3M8
- After: https://share.firefox.dev/40LXvYb

Here are the results from the Node.js profiler:

- https://notriddle.com/rustdoc-html-demo-15/trie-perf/index.html

Here's a copy that you can use to try it out. Compare it with [the nightly].
Try typing `typecheckercontext` one character at a time, slowly.

- https://notriddle.com/rustdoc-html-demo-15/compiler-doc-trie/index.html

[the nightly]: https://doc.rust-lang.org/nightly/nightly-rustc/

The fuzzy matching algorithm is based on [Fast String Correction with
Levenshtein-Automata] and the corresponding implementation code in [moman]
and [Lucene]. The bit-packing representation comes from Lucene, but the
actual matcher follows `fsc.py` more closely. As suggested in the paper, a
trie is used to represent the FSA dictionary.

The same trie is used for prefix matching. Substring matching is done with a
side table of three-character[^1] windows that point into the trie.
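
To make the prefix and substring lookups concrete, here is a minimal sketch of one way a trie plus a three-code-unit window table could fit together. It is not the code from this commit: `TrieNode`, `NameIndex`, `windows`, and the choice to store the node reached at the end of each window are all assumptions made for illustration.

```javascript
// Illustrative sketch only; every name here is hypothetical.
class TrieNode {
    constructor() {
        this.children = new Map(); // UTF-16 code unit -> TrieNode
        this.ids = [];             // ids of names that end exactly at this node
    }
}

class NameIndex {
    constructor() {
        this.root = new TrieNode();
        // Side table: three-code-unit window -> Set of trie nodes reached
        // right after that window (assumed layout, not rustdoc's).
        this.windows = new Map();
    }

    insert(name, id) {
        let node = this.root;
        for (let i = 0; i < name.length; i += 1) {
            const unit = name.charCodeAt(i);
            if (!node.children.has(unit)) {
                node.children.set(unit, new TrieNode());
            }
            node = node.children.get(unit); // node now spells name.slice(0, i + 1)
            if (i >= 3) {
                // Window ending at i. Windows starting at position 0 are
                // skipped because the plain prefix walk already covers them.
                const win = name.slice(i - 2, i + 1);
                if (!this.windows.has(win)) {
                    this.windows.set(win, new Set());
                }
                this.windows.get(win).add(node);
            }
        }
        node.ids.push(id);
    }

    // Follow `text` downward from `node`; returns undefined if the path is missing.
    static walk(node, text) {
        for (let i = 0; i < text.length && node !== undefined; i += 1) {
            node = node.children.get(text.charCodeAt(i));
        }
        return node;
    }

    // Add every id stored in the subtree rooted at `node` to `out`.
    static collect(node, out) {
        if (node === undefined) {
            return;
        }
        for (const id of node.ids) {
            out.add(id);
        }
        for (const child of node.children.values()) {
            NameIndex.collect(child, out);
        }
    }

    // Ids of all names that start with `query`.
    prefix(query) {
        const out = new Set();
        NameIndex.collect(NameIndex.walk(this.root, query), out);
        return out;
    }

    // Ids of all names that contain `query`. Mid-name matches are only
    // attempted for queries of three or more code units.
    substring(query) {
        const out = this.prefix(query);
        if (query.length < 3) {
            return out;
        }
        for (const node of this.windows.get(query.slice(0, 3)) ?? []) {
            NameIndex.collect(NameIndex.walk(node, query.slice(3)), out);
        }
        return out;
    }
}

const index = new NameIndex();
index.insert("typecheck", 0);
index.insert("check", 1);
index.insert("uncheckedadd", 2);
console.log([...index.prefix("che")].sort());
console.log([...index.substring("che")].sort());
```

In this layout a substring query looks up its first three code units in the side table, then keeps walking the rest of the query from each node it finds, so the same trie serves both kinds of lookup.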

[Fast String Correction with Levenshtein-Automata]: https://github.com/tpn/pdfs/blob/master/Fast%20String%20Correction%20with%20Levenshtein-Automata%20(2002)%20(10.1.1.16.652).pdf
[Lucene]: https://fossies.org/linux/lucene/lucene/core/src/java/org/apache/lucene/util/automaton/Lev1TParametricDescription.java
[moman]: https://gitlab.com/notriddle/moman-rustdoc

User-visible changes
--------------------

I don't expect anybody to notice anything, but it does cause two changes:

- Substring matches, in the middle of a name, only apply if there are three
  or more characters[^1] in the search query.
- The Levenshtein distance limit now maxes out at two. In the old version,
  the limit was w/3 (w being the query length), so you could get looser
  matches for queries with 9 or more characters[^1] in them.
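
The effect of the new cap can be illustrated with a small sketch. The function names are hypothetical, rounding w/3 down is an assumption (it agrees with the "9 or more characters" remark above), and the distance here is a plain dynamic-programming Levenshtein, not the automaton-based matcher this commit adds.

```javascript
// Hypothetical helpers for illustration; not taken from rustdoc's source.

// Old rule as described above: w/3, assumed to round down.
function oldLimit(queryLength) {
    return Math.floor(queryLength / 3);
}

// New rule: same value, but never more than two.
function newLimit(queryLength) {
    return Math.min(oldLimit(queryLength), 2);
}

// Textbook Levenshtein distance over UTF-16 code units, kept to two rows.
function editDistance(a, b) {
    let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
    for (let i = 1; i <= a.length; i += 1) {
        const curr = [i];
        for (let j = 1; j <= b.length; j += 1) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr[j] = Math.min(
                prev[j] + 1,        // delete from a
                curr[j - 1] + 1,    // insert into a
                prev[j - 1] + cost, // substitute or keep
            );
        }
        prev = curr;
    }
    return prev[b.length];
}

const name = "typecheckercontext";
for (const query of ["typechekercontxt", "tpechekercontxt"]) {
    const d = editDistance(query, name);
    console.log(query, {
        distance: d,
        oldRuleMatches: d <= oldLimit(query.length),
        newRuleMatches: d <= newLimit(query.length),
    });
}
```

Under these assumptions, a query that is two edits away from the name matches under both rules, while one that is three edits away matches only under the old rule.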

[^1]: Technically, UTF-16 code units.
2024-11-13 12:04:46 -07:00
alias-1.js rustdoc-search: fix description on aliases in results 2024-04-18 22:21:29 -07:00
alias-2.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
alias-3.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
alias-4.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
alias.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
asrawfd.js rustdoc-search: count path edits with separate edit limit 2023-12-26 18:46:17 -07:00
basic.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
bufread-fill-buf.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
deduplication.js rustdoc: use a trie for name-based search 2024-11-13 12:04:46 -07:00
enum-option.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
exact-case.js rustdoc: show exact case-sensitive matches first 2024-08-23 13:05:24 -04:00
filter-crate.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
fn-forget.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
from_u.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
full-path-function.js Add tests for type-based search 2023-09-01 15:16:11 +02:00
iterator-type-signatures.js rustdoc-search: add support for associated types 2023-11-19 18:54:36 -07:00
keyword.js rustdoc-search: make primitives and keywords less special 2023-11-21 13:59:26 -07:00
macro-check.js rustdoc-search: make primitives and keywords less special 2023-11-21 13:59:26 -07:00
macro-print.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
never.js rustdoc-search: search never type with ! 2023-06-12 17:30:23 -07:00
option-type-signatures.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00
osstring-to-string.js Add display method to OsStr 2024-01-18 20:38:31 +00:00
parser-bindings.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-errors.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-filter.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-generics.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-hof.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-ident.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-literal.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-paths.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-quote.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-reference.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-returned.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-separators.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-slice-array.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-tuple.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
parser-weird-queries.js rustdoc-search: pass original names through AST 2024-10-30 10:35:38 -07:00
path-maxeditdistance.js rustdoc: use a trie for name-based search 2024-11-13 12:04:46 -07:00
path-ordering.js Fix rustdoc-js-std path-ordering test due to API removal 2024-10-02 11:15:48 +02:00
primitive.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
println-typo.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
quoted.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
reference-shrink.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
regex.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
return-specific-literal.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
return-specific.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
should-fail.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
simd-type-signatures.js Adjust ranking so that duplicates count against rank 2024-10-31 13:12:14 -07:00
string-from_ut.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
struct-vec.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
transmute-fail.js Adjust ranking so that duplicates count against rank 2024-10-31 13:12:14 -07:00
transmute.js Adjust ranking so that duplicates count against rank 2024-10-31 13:12:14 -07:00
typed-query.js Update rustdoc-js* format 2023-06-09 17:00:47 +02:00
vec-new.js rustdoc-search: single result for items with multiple paths 2024-04-08 17:07:14 -07:00
vec-type-signatures.js rustdoc-search: simplify rules for generics and type params 2024-10-30 12:27:48 -07:00