rust/compiler/rustc_data_structures/src
bors 76c500ec6c Auto merge of #81635 - michaelwoerister:structured_def_path_hash, r=pnkfelix
Let a portion of DefPathHash uniquely identify the DefPath's crate.

This makes it possible to map directly from a `DefPathHash` to the crate it originates from, without constructing side tables for that mapping. That is useful for incremental compilation, where we deal with `DefPathHash` instead of `DefId` a lot.

It also makes it possible to check for `DefPathHash` collisions reliably and cheaply, so the compiler can gracefully abort compilation instead of running into a subsequent ICE at some random place in the code.
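
As a rough illustration of the two points above, here is a minimal sketch, not the compiler's actual implementation. The field names, the `HashRegistry` type, and its `register` method are hypothetical; only the idea of two 64-bit halves and an explicit collision check comes from the description.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the compiler's type: two 64-bit halves.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefPathHash {
    /// Identifies the originating crate (assumed field name).
    stable_crate_id: u64,
    /// Identifies the `DefPath` within that crate (assumed field name).
    local_hash: u64,
}

/// Hypothetical registry that remembers which path produced each hash
/// and reports a collision instead of silently overwriting an entry.
#[derive(Default)]
struct HashRegistry {
    seen: HashMap<DefPathHash, String>,
}

impl HashRegistry {
    fn register(&mut self, hash: DefPathHash, def_path: &str) -> Result<(), String> {
        if let Some(existing) = self.seen.get(&hash) {
            if existing.as_str() != def_path {
                // Two different paths share one hash: abort gracefully.
                return Err(format!(
                    "hash collision between `{}` and `{}`",
                    existing, def_path
                ));
            }
        }
        self.seen.insert(hash, def_path.to_string());
        Ok(())
    }
}

fn main() {
    let mut registry = HashRegistry::default();
    let a = DefPathHash { stable_crate_id: 0x1111, local_hash: 0xAAAA };
    let b = DefPathHash { stable_crate_id: 0x2222, local_hash: 0xAAAA };

    // Same local hash in different crates is fine: the crate half differs.
    registry.register(a, "crate_a::foo").unwrap();
    registry.register(b, "crate_b::foo").unwrap();

    // The crate is read straight off the hash, with no side table.
    println!("crate id of a: {:#x}", a.stable_crate_id);

    // A different path with an identical hash is reported as an error.
    println!("{:?}", registry.register(a, "crate_a::bar"));
}
```

Because the crate half is part of the hash itself, the lookup from hash to crate is a field access in this sketch, and the collision check is a single map probe per registered path.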

The following new piece of documentation describes the most interesting aspects of the changes:

```rust
/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Continuing compilation
/// with such a hash collision would very probably lead to an ICE and, in the
/// worst case, to a silent mis-compilation. The compiler therefore actively
/// and exhaustively checks for such hash collisions and aborts compilation if
/// it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said than done if the crates in question are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.
```
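
The two figures quoted at the end of the documentation are consistent with the usual birthday-bound approximation, where the chance of any collision among n uniformly random b-bit hashes is about n² / 2^(b+1). The sketch below assumes that approximation; the source does not state which formula was used, and the function name `one_in` is made up for illustration.

```rust
/// Approximate "1 in N" odds of at least one collision among `items`
/// uniformly random `bits`-bit hashes, using p ≈ n² / 2^(bits + 1).
/// Returns N = 1 / p.
fn one_in(items: u64, bits: u32) -> f64 {
    let n = items as f64;
    let space = 2f64.powi(bits as i32);
    2.0 * space / (n * n)
}

fn main() {
    // ~50000 `LocalDefId`s in one crate, 64-bit crate-internal hash.
    println!("crate-internal: 1 in {:.0}", one_in(50_000, 64));
    // 1000 crates in a crate graph, 64-bit `StableCrateId`.
    println!("crate-id:       1 in {:.0}", one_in(1_000, 64));
}
```

Under this approximation the results come out at roughly 1.48e10 and 3.69e13, in line with the values given in the doc comment.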

Given the probabilities involved, I hope that no one will ever actually see the error messages. Nonetheless, I'd welcome feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?

r? `@pnkfelix` (feel free to re-assign)
2021-03-07 23:45:57 +00:00
..
base_n
binary_search_util
graph Only initialize what is used 2021-02-10 09:20:41 +01:00
obligation_forest Turn Outcome into an opaque type to remove some runtime checks 2020-10-15 08:32:41 +02:00
owning_ref
sip128 SipHasher128: improve constant names and add more comments 2020-10-11 23:48:35 -07:00
small_c_str
snapshot_map
sorted_map Switch compiler/ to intra-doc links 2020-12-18 15:22:51 -05:00
sso Rollup merge of #78083 - ChaiTRex:master, r=m-ou-se 2020-12-19 15:15:57 +09:00
stable_hasher Stable hashing: add comments and tests concerning platform-independence 2020-09-30 00:57:35 -07:00
tagged_ptr Use T::BITS instead of size_of::<T> * 8. 2020-09-19 06:54:42 +02:00
tiny_list
transitive_relation
atomic_ref.rs
base_n.rs
box_region.rs
captures.rs
fingerprint.rs Add documentation to Unhasher impl for Fingerprint. 2021-02-04 10:37:11 +01:00
flock.rs
frozen.rs
functor.rs words 2020-11-16 22:42:09 +01:00
fx.rs
jobserver.rs datastructures: replace lazy_static by SyncLazy from std 2020-09-01 22:06:47 +01:00
lib.rs Rollup merge of #82057 - upsuper-forks:cstr, r=davidtwco,wesleywiser 2021-02-27 02:34:21 +01:00
macros.rs Remove unused static_assert macro 2020-09-20 11:40:51 +02:00
map_in_place.rs
profiling.rs Print -Ztime-passes (and misc stats/logs) on stderr, not stdout. 2021-02-18 14:13:38 +02:00
ptr_key.rs
sharded.rs Separate the query cache from the query state. 2021-02-13 21:14:58 +01:00
sip128.rs SipHasher128: improve constant names and add more comments 2020-10-11 23:48:35 -07:00
small_c_str.rs
sorted_map.rs Replace absolute paths with relative ones 2020-10-13 14:16:45 +02:00
stable_hasher.rs Enforce that query results implement Debug 2021-01-16 17:53:02 -05:00
stable_map.rs
stable_set.rs
stack.rs
steal.rs Auto merge of #80692 - Aaron1011:feature/query-result-debug, r=estebank 2021-01-26 05:47:23 +00:00
svh.rs
sync.rs Use RwLock instead of Lock for SourceMap::files 2020-10-29 18:09:53 +01:00
tagged_ptr.rs Fix typos 2020-10-29 16:51:46 +01:00
temp_dir.rs Capitalize safety comments 2020-09-08 22:37:18 -04:00
thin_vec.rs
tiny_list.rs
transitive_relation.rs Fix typos 2020-10-29 16:51:46 +01:00
unhash.rs Avoid rehashing Fingerprint as a map key 2020-09-01 18:27:02 -07:00
vec_linked_list.rs
work_queue.rs Remove unused code from remaining compiler crates 2020-10-14 04:14:32 +02:00