PR feedback implementation

This commit is contained in:
Josh Mcguigan 2020-04-06 15:38:20 -07:00
parent da6752d5f9
commit a208de15b7

View File

@ -2,7 +2,191 @@
//! for match arms.
//!
//! It is modeled on the rustc module `librustc_mir_build::hair::pattern::_match`, which
//! contains very detailed documentation about the algorithms used here.
//! contains very detailed documentation about the algorithms used here. I've duplicated
//! most of that documentation below.
//!
//! This file includes the logic for exhaustiveness and usefulness checking for
//! pattern-matching. Specifically, given a list of patterns for a type, we can
//! tell whether:
//! (a) the patterns cover every possible constructor for the type [exhaustiveness]
//! (b) each pattern is necessary [usefulness]
//!
//! The algorithm implemented here is a modified version of the one described in:
//! http://moscova.inria.fr/~maranget/papers/warn/index.html
//! However, to save future implementors from reading the original paper, we
//! summarise the algorithm here to hopefully save time and be a little clearer
//! (without being so rigorous).
//!
//! The core of the algorithm revolves about a "usefulness" check. In particular, we
//! are trying to compute a predicate `U(P, p)` where `P` is a list of patterns (we refer to this as
//! a matrix). `U(P, p)` represents whether, given an existing list of patterns
//! `P_1 ..= P_m`, adding a new pattern `p` will be "useful" (that is, cover previously-
//! uncovered values of the type).
//!
//! If we have this predicate, then we can easily compute both exhaustiveness of an
//! entire set of patterns and the individual usefulness of each one.
//! (a) the set of patterns is exhaustive iff `U(P, _)` is false (i.e., adding a wildcard
//! match doesn't increase the number of values we're matching)
//! (b) a pattern `P_i` is not useful if `U(P[0..=(i-1), P_i)` is false (i.e., adding a
//! pattern to those that have come before it doesn't increase the number of values
//! we're matching).
//!
//! During the course of the algorithm, the rows of the matrix won't just be individual patterns,
//! but rather partially-deconstructed patterns in the form of a list of patterns. The paper
//! calls those pattern-vectors, and we will call them pattern-stacks. The same holds for the
//! new pattern `p`.
//!
//! For example, say we have the following:
//! ```
//! // x: (Option<bool>, Result<()>)
//! match x {
//! (Some(true), _) => {}
//! (None, Err(())) => {}
//! (None, Err(_)) => {}
//! }
//! ```
//! Here, the matrix `P` starts as:
//! [
//! [(Some(true), _)],
//! [(None, Err(()))],
//! [(None, Err(_))],
//! ]
//! We can tell it's not exhaustive, because `U(P, _)` is true (we're not covering
//! `[(Some(false), _)]`, for instance). In addition, row 3 is not useful, because
//! all the values it covers are already covered by row 2.
//!
//! A list of patterns can be thought of as a stack, because we are mainly interested in the top of
//! the stack at any given point, and we can pop or apply constructors to get new pattern-stacks.
//! To match the paper, the top of the stack is at the beginning / on the left.
//!
//! There are two important operations on pattern-stacks necessary to understand the algorithm:
//! 1. We can pop a given constructor off the top of a stack. This operation is called
//! `specialize`, and is denoted `S(c, p)` where `c` is a constructor (like `Some` or
//! `None`) and `p` a pattern-stack.
//! If the pattern on top of the stack can cover `c`, this removes the constructor and
//! pushes its arguments onto the stack. It also expands OR-patterns into distinct patterns.
//! Otherwise the pattern-stack is discarded.
//! This essentially filters those pattern-stacks whose top covers the constructor `c` and
//! discards the others.
//!
//! For example, the first pattern above initially gives a stack `[(Some(true), _)]`. If we
//! pop the tuple constructor, we are left with `[Some(true), _]`, and if we then pop the
//! `Some` constructor we get `[true, _]`. If we had popped `None` instead, we would get
//! nothing back.
//!
//! This returns zero or more new pattern-stacks, as follows. We look at the pattern `p_1`
//! on top of the stack, and we have four cases:
//! 1.1. `p_1 = c(r_1, .., r_a)`, i.e. the top of the stack has constructor `c`. We
//! push onto the stack the arguments of this constructor, and return the result:
//! r_1, .., r_a, p_2, .., p_n
//! 1.2. `p_1 = c'(r_1, .., r_a')` where `c ≠ c'`. We discard the current stack and
//! return nothing.
//! 1.3. `p_1 = _`. We push onto the stack as many wildcards as the constructor `c` has
//! arguments (its arity), and return the resulting stack:
//! _, .., _, p_2, .., p_n
//! 1.4. `p_1 = r_1 | r_2`. We expand the OR-pattern and then recurse on each resulting
//! stack:
//! S(c, (r_1, p_2, .., p_n))
//! S(c, (r_2, p_2, .., p_n))
//!
//! 2. We can pop a wildcard off the top of the stack. This is called `D(p)`, where `p` is
//! a pattern-stack.
//! This is used when we know there are missing constructor cases, but there might be
//! existing wildcard patterns, so to check the usefulness of the matrix, we have to check
//! all its *other* components.
//!
//! It is computed as follows. We look at the pattern `p_1` on top of the stack,
//! and we have three cases:
//! 1.1. `p_1 = c(r_1, .., r_a)`. We discard the current stack and return nothing.
//! 1.2. `p_1 = _`. We return the rest of the stack:
//! p_2, .., p_n
//! 1.3. `p_1 = r_1 | r_2`. We expand the OR-pattern and then recurse on each resulting
//! stack.
//! D((r_1, p_2, .., p_n))
//! D((r_2, p_2, .., p_n))
//!
//! Note that the OR-patterns are not always used directly in Rust, but are used to derive the
//! exhaustive integer matching rules, so they're written here for posterity.
//!
//! Both those operations extend straightforwardly to a list or pattern-stacks, i.e. a matrix, by
//! working row-by-row. Popping a constructor ends up keeping only the matrix rows that start with
//! the given constructor, and popping a wildcard keeps those rows that start with a wildcard.
//!
//!
//! The algorithm for computing `U`
//! -------------------------------
//! The algorithm is inductive (on the number of columns: i.e., components of tuple patterns).
//! That means we're going to check the components from left-to-right, so the algorithm
//! operates principally on the first component of the matrix and new pattern-stack `p`.
//! This algorithm is realised in the `is_useful` function.
//!
//! Base case. (`n = 0`, i.e., an empty tuple pattern)
//! - If `P` already contains an empty pattern (i.e., if the number of patterns `m > 0`),
//! then `U(P, p)` is false.
//! - Otherwise, `P` must be empty, so `U(P, p)` is true.
//!
//! Inductive step. (`n > 0`, i.e., whether there's at least one column
//! [which may then be expanded into further columns later])
//! We're going to match on the top of the new pattern-stack, `p_1`.
//! - If `p_1 == c(r_1, .., r_a)`, i.e. we have a constructor pattern.
//! Then, the usefulness of `p_1` can be reduced to whether it is useful when
//! we ignore all the patterns in the first column of `P` that involve other constructors.
//! This is where `S(c, P)` comes in:
//! `U(P, p) := U(S(c, P), S(c, p))`
//! This special case is handled in `is_useful_specialized`.
//!
//! For example, if `P` is:
//! [
//! [Some(true), _],
//! [None, 0],
//! ]
//! and `p` is [Some(false), 0], then we don't care about row 2 since we know `p` only
//! matches values that row 2 doesn't. For row 1 however, we need to dig into the
//! arguments of `Some` to know whether some new value is covered. So we compute
//! `U([[true, _]], [false, 0])`.
//!
//! - If `p_1 == _`, then we look at the list of constructors that appear in the first
//! component of the rows of `P`:
//! + If there are some constructors that aren't present, then we might think that the
//! wildcard `_` is useful, since it covers those constructors that weren't covered
//! before.
//! That's almost correct, but only works if there were no wildcards in those first
//! components. So we need to check that `p` is useful with respect to the rows that
//! start with a wildcard, if there are any. This is where `D` comes in:
//! `U(P, p) := U(D(P), D(p))`
//!
//! For example, if `P` is:
//! [
//! [_, true, _],
//! [None, false, 1],
//! ]
//! and `p` is [_, false, _], the `Some` constructor doesn't appear in `P`. So if we
//! only had row 2, we'd know that `p` is useful. However row 1 starts with a
//! wildcard, so we need to check whether `U([[true, _]], [false, 1])`.
//!
//! + Otherwise, all possible constructors (for the relevant type) are present. In this
//! case we must check whether the wildcard pattern covers any unmatched value. For
//! that, we can think of the `_` pattern as a big OR-pattern that covers all
//! possible constructors. For `Option`, that would mean `_ = None | Some(_)` for
//! example. The wildcard pattern is useful in this case if it is useful when
//! specialized to one of the possible constructors. So we compute:
//! `U(P, p) := ∃(k ϵ constructors) U(S(k, P), S(k, p))`
//!
//! For example, if `P` is:
//! [
//! [Some(true), _],
//! [None, false],
//! ]
//! and `p` is [_, false], both `None` and `Some` constructors appear in the first
//! components of `P`. We will therefore try popping both constructors in turn: we
//! compute U([[true, _]], [_, false]) for the `Some` constructor, and U([[false]],
//! [false]) for the `None` constructor. The first case returns true, so we know that
//! `p` is useful for `P`. Indeed, it matches `[Some(false), _]` that wasn't matched
//! before.
//!
//! - If `p_1 == r_1 | r_2`, then the usefulness depends on each `r_i` separately:
//! `U(P, p) := U(P, (r_1, p_2, .., p_n))
//! || U(P, (r_2, p_2, .., p_n))`
use std::sync::Arc;
use smallvec::{smallvec, SmallVec};
@ -134,15 +318,25 @@ impl PatStack {
) -> MatchCheckResult<Option<PatStack>> {
let result = match (self.head().as_pat(cx), constructor) {
(Pat::Tuple(ref pat_ids), Constructor::Tuple { arity }) => {
if pat_ids.len() != *arity {
None
} else {
Some(self.replace_head_with(pat_ids))
}
debug_assert_eq!(
pat_ids.len(),
*arity,
"we type check before calling this code, so we should never hit this case",
);
Some(self.replace_head_with(pat_ids))
}
(Pat::Lit(_), Constructor::Bool(_)) => {
// for now we only support bool literals
Some(self.to_tail())
(Pat::Lit(lit_expr), Constructor::Bool(constructor_val)) => {
match cx.body.exprs[lit_expr] {
Expr::Literal(Literal::Bool(pat_val)) if *constructor_val == pat_val => {
Some(self.to_tail())
}
// it was a bool but the value doesn't match
Expr::Literal(Literal::Bool(_)) => None,
// perhaps this is actually unreachable given we have
// already checked that these match arms have the appropriate type?
_ => return Err(MatchCheckNotImplemented),
}
}
(Pat::Wild, constructor) => Some(self.expand_wildcard(cx, constructor)?),
(Pat::Path(_), Constructor::Enum(constructor)) => {
@ -162,7 +356,7 @@ impl PatStack {
Some(self.replace_head_with(pat_ids))
}
}
(Pat::Or(_), _) => unreachable!("we desugar or patterns so this should never happen"),
(Pat::Or(_), _) => return Err(MatchCheckNotImplemented),
(_, _) => return Err(MatchCheckNotImplemented),
};
@ -186,19 +380,8 @@ impl PatStack {
);
let mut patterns: PatStackInner = smallvec![];
let arity = match constructor {
Constructor::Bool(_) => 0,
Constructor::Tuple { arity } => *arity,
Constructor::Enum(e) => {
match cx.db.enum_data(e.parent).variants[e.local_id].variant_data.as_ref() {
VariantData::Tuple(struct_field_data) => struct_field_data.len(),
VariantData::Unit => 0,
_ => return Err(MatchCheckNotImplemented),
}
}
};
for _ in 0..arity {
for _ in 0..constructor.arity(cx)? {
patterns.push(PatIdOrWild::Wild);
}
@ -368,46 +551,23 @@ pub(crate) fn is_useful(
// constructors are covered (`Some`/`None`), so we need
// to perform specialization to see that our wildcard will cover
// the `Some(false)` case.
let mut constructor = None;
for pat in matrix.heads() {
if let Some(c) = pat_constructor(cx, pat)? {
constructor = Some(c);
break;
}
//
// Here we create a constructor for each variant and then check
// usefulness after specializing for that constructor.
let mut found_unimplemented = false;
for constructor in constructor.all_constructors(cx) {
let matrix = matrix.specialize_constructor(&cx, &constructor)?;
let v = v.expand_wildcard(&cx, &constructor)?;
match is_useful(&cx, &matrix, &v) {
Ok(Usefulness::Useful) => return Ok(Usefulness::Useful),
Ok(Usefulness::NotUseful) => continue,
_ => found_unimplemented = true,
};
}
if let Some(constructor) = constructor {
if let Constructor::Enum(e) = constructor {
// For enums we handle each variant as a distinct constructor, so
// here we create a constructor for each variant and then check
// usefulness after specializing for that constructor.
let mut found_unimplemented = false;
for constructor in
cx.db.enum_data(e.parent).variants.iter().map(|(local_id, _)| {
Constructor::Enum(EnumVariantId { parent: e.parent, local_id })
})
{
let matrix = matrix.specialize_constructor(&cx, &constructor)?;
let v = v.expand_wildcard(&cx, &constructor)?;
match is_useful(&cx, &matrix, &v) {
Ok(Usefulness::Useful) => return Ok(Usefulness::Useful),
Ok(Usefulness::NotUseful) => continue,
_ => found_unimplemented = true,
};
}
if found_unimplemented {
Err(MatchCheckNotImplemented)
} else {
Ok(Usefulness::NotUseful)
}
} else {
let matrix = matrix.specialize_constructor(&cx, &constructor)?;
let v = v.expand_wildcard(&cx, &constructor)?;
is_useful(&cx, &matrix, &v)
}
if found_unimplemented {
Err(MatchCheckNotImplemented)
} else {
Ok(Usefulness::NotUseful)
}
@ -425,7 +585,7 @@ pub(crate) fn is_useful(
}
}
#[derive(Debug)]
#[derive(Debug, Clone, Copy)]
/// Similar to TypeCtor, but includes additional information about the specific
/// value being instantiated. For example, TypeCtor::Bool doesn't contain the
/// boolean value.
@ -435,6 +595,40 @@ enum Constructor {
Enum(EnumVariantId),
}
impl Constructor {
fn arity(&self, cx: &MatchCheckCtx) -> MatchCheckResult<usize> {
let arity = match self {
Constructor::Bool(_) => 0,
Constructor::Tuple { arity } => *arity,
Constructor::Enum(e) => {
match cx.db.enum_data(e.parent).variants[e.local_id].variant_data.as_ref() {
VariantData::Tuple(struct_field_data) => struct_field_data.len(),
VariantData::Unit => 0,
_ => return Err(MatchCheckNotImplemented),
}
}
};
Ok(arity)
}
fn all_constructors(&self, cx: &MatchCheckCtx) -> Vec<Constructor> {
match self {
Constructor::Bool(_) => vec![Constructor::Bool(true), Constructor::Bool(false)],
Constructor::Tuple { .. } => vec![*self],
Constructor::Enum(e) => cx
.db
.enum_data(e.parent)
.variants
.iter()
.map(|(local_id, _)| {
Constructor::Enum(EnumVariantId { parent: e.parent, local_id })
})
.collect(),
}
}
}
/// Returns the constructor for the given pattern. Should only return None
/// in the case of a Wild pattern.
fn pat_constructor(cx: &MatchCheckCtx, pat: PatIdOrWild) -> MatchCheckResult<Option<Constructor>> {
@ -501,14 +695,7 @@ fn all_constructors_covered(
}
fn enum_variant_matches(cx: &MatchCheckCtx, pat_id: PatId, enum_variant_id: EnumVariantId) -> bool {
if let Some(VariantId::EnumVariantId(pat_variant_id)) =
cx.infer.variant_resolution_for_pat(pat_id)
{
if pat_variant_id.local_id == enum_variant_id.local_id {
return true;
}
}
false
Some(enum_variant_id.into()) == cx.infer.variant_resolution_for_pat(pat_id)
}
#[cfg(test)]
@ -522,10 +709,10 @@ mod tests {
TestDB::with_single_file(content).0.diagnostics().0
}
pub(super) fn check_diagnostic_with_no_fix(content: &str) {
pub(super) fn check_diagnostic(content: &str) {
let diagnostic_count = TestDB::with_single_file(content).0.diagnostics().1;
assert_eq!(1, diagnostic_count, "no diagnotic reported");
assert_eq!(1, diagnostic_count, "no diagnostic reported");
}
pub(super) fn check_no_diagnostic(content: &str) {
@ -558,7 +745,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -596,7 +783,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -621,7 +808,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -646,7 +833,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -659,7 +846,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -685,7 +872,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -698,7 +885,37 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
fn tuple_of_bools_missing_arm() {
let content = r"
fn test_fn() {
match (false, true) {
(false, true) => {},
(false, false) => {},
(true, false) => {},
}
}
";
check_diagnostic(content);
}
#[test]
fn tuple_of_bools_with_wilds() {
let content = r"
fn test_fn() {
match (false, true) {
(false, _) => {},
(true, false) => {},
(_, true) => {},
}
}
";
check_no_diagnostic(content);
}
#[test]
@ -727,7 +944,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -754,7 +971,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -767,7 +984,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -796,7 +1013,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -827,7 +1044,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -844,7 +1061,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -879,7 +1096,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -913,7 +1130,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -931,7 +1148,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -1004,7 +1221,7 @@ mod tests {
}
";
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -1089,7 +1306,7 @@ mod tests {
";
// Match arms with the incorrect type are filtered out.
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
@ -1104,7 +1321,23 @@ mod tests {
";
// Match arms with the incorrect type are filtered out.
check_diagnostic_with_no_fix(content);
check_diagnostic(content);
}
#[test]
fn enum_not_in_scope() {
let content = r"
fn test_fn() {
match Foo::Bar {
Foo::Baz => (),
}
}
";
// The enum is not in scope so we don't perform exhaustiveness
// checking, but we want to be sure we don't panic here (and
// we don't create a diagnostic).
check_no_diagnostic(content);
}
}
@ -1158,17 +1391,21 @@ mod false_negatives {
}
#[test]
fn enum_not_in_scope() {
fn internal_or() {
let content = r"
fn test_fn() {
match Foo::Bar {
Foo::Baz => (),
enum Either {
A(bool),
B,
}
match Either::B {
Either::A(true | false) => (),
}
}
";
// This is a false negative.
// The enum is not in scope so we don't perform exhaustiveness checking.
// We do not currently handle patterns with internal `or`s.
check_no_diagnostic(content);
}
}