Chapter 17  Extended pattern-matching

Cristina Cornes and Hugo Herbelin

This section describes the full form of pattern-matching in Coq terms.

17.1  Patterns

The full syntax of match is presented in Figures 1.1 and 1.2. Identifiers in patterns are either constructor names or variables. Any identifier that is not the constructor of an inductive or co-inductive type is considered to be a variable. A variable name cannot occur more than once in a given pattern. It is recommended to start variable names by a lowercase letter.

If a pattern has the form (c x) where c is a constructor symbol and x is a linear vector of (distinct) variables, it is called simple: it is the kind of pattern recognized by the basic version of match. On the opposite, if it is a variable x or has the form (c p) with p not only made of variables, the pattern is called nested.

A variable pattern matches any value, and the identifier is bound to that value. The pattern “_” (called “don’t care” or “wildcard” symbol) also matches any value, but does not bind anything. It may occur an arbitrary number of times in a pattern. Alias patterns written (pattern as identifier) are also accepted. This pattern matches the same values as pattern does and identifier is bound to the matched value. A pattern of the form pattern|pattern is called disjunctive. A list of patterns separated with commas is also considered as a pattern and is called multiple pattern. However multiple patterns can only occur at the root of pattern-matching equations. Disjunctions of multiple pattern are allowed though.

Since extended match expressions are compiled into the primitive ones, the expressiveness of the theory remains the same. Once the stage of parsing has finished only simple patterns remain. Re-nesting of pattern is performed at printing time. An easy way to see the result of the expansion is to toggle off the nesting performed at printing (use here Set Printing Matching), then by printing the term with Print if the term is a constant, or using the command Check.

The extended match still accepts an optional elimination predicate given after the keyword return. Given a pattern matching expression, if all the right-hand-sides of => (rhs in short) have the same type, then this type can be sometimes synthesized, and so we can omit the return part. Otherwise the predicate after return has to be provided, like for the basic match.

Let us illustrate through examples the different aspects of extended pattern matching. Consider for example the function that computes the maximum of two natural numbers. We can write it in primitive syntax by:

Coq < Fixpoint max (n m:nat) {struct m} : nat :=
Coq <   match n with
Coq <   | O => m
Coq <   | S n’ => match m with
Coq <             | O => S n’
Coq <             | S m’ => S (max n’ m’)
Coq <             end
Coq <   end.
max is recursively defined (decreasing on 2nd argument)
Multiple patterns

Using multiple patterns in the definition of max allows to write:

Coq < Reset max.

Coq < Fixpoint max (n m:nat) {struct m} : nat :=
Coq <   match n, m with
Coq <   | O, _ => m
Coq <   | S n’, O => S n’
Coq <   | S n’, S m’ => S (max n’ m’)
Coq <   end.
max is recursively defined (decreasing on 2nd argument)

which will be compiled into the previous form.

The pattern-matching compilation strategy examines patterns from left to right. A match expression is generated only when there is at least one constructor in the column of patterns. E.g. the following example does not build a match expression.

Coq < Check (fun x:nat => match x return nat with
Coq <                     | y => y
Coq <                     end).
fun x : nat => x
     : nat -> nat
Aliasing subpatterns

We can also use “as ident” to associate a name to a sub-pattern:

Coq < Reset max.

Coq < Fixpoint max (n m:nat) {struct n} : nat :=
Coq <   match n, m with
Coq <   | O, _ => m
Coq <   | S n’ as p, O => p
Coq <   | S n’, S m’ => S (max n’ m’)
Coq <   end.
max is recursively defined (decreasing on 1st argument)
Nested patterns

Here is now an example of nested patterns:

Coq < Fixpoint even (n:nat) : bool :=
Coq <   match n with
Coq <   | O => true
Coq <   | S O => false
Coq <   | S (S n’) => even n’
Coq <   end.
even is recursively defined (decreasing on 1st argument)

This is compiled into:

Coq < Print even.
even = 
fix even (n : nat) : bool :=
  match n with
  | 0 => true
  | 1 => false
  | S (S n’) => even n’
  end
     : nat -> bool
Argument scope is [nat_scope]

In the previous examples patterns do not conflict with, but sometimes it is comfortable to write patterns that admit a non trivial superposition. Consider the boolean function lef that given two natural numbers yields true if the first one is less or equal than the second one and false otherwise. We can write it as follows:

Coq < Fixpoint lef (n m:nat) {struct m} : bool :=
Coq <   match n, m with
Coq <   | O, x => true
Coq <   | x, O => false
Coq <   | S n, S m => lef n m
Coq <   end.
lef is recursively defined (decreasing on 2nd argument)

Note that the first and the second multiple pattern superpose because the couple of values O O matches both. Thus, what is the result of the function on those values? To eliminate ambiguity we use the textual priority rule: we consider patterns ordered from top to bottom, then a value is matched by the pattern at the ith row if and only if it is not matched by some pattern of a previous row. Thus in the example, O O is matched by the first pattern, and so (lef O O) yields true.

Another way to write this function is:

Coq < Reset lef.

Coq < Fixpoint lef (n m:nat) {struct m} : bool :=
Coq <   match n, m with
Coq <   | O, x => true
Coq <   | S n, S m => lef n m
Coq <   | _, _ => false
Coq <   end.
lef is recursively defined (decreasing on 2nd argument)

Here the last pattern superposes with the first two. Because of the priority rule, the last pattern will be used only for values that do not match neither the first nor the second one.

Terms with useless patterns are not accepted by the system. Here is an example:

Coq < Check (fun x:nat =>
Coq <          match x with
Coq <          | O => true
Coq <          | S _ => false
Coq <          | x => true
Coq <          end).
Coq < Coq < Toplevel input, characters 246-255:
>          | x => true
>            ^^^^^^^^^
Error: This clause is redundant.
Disjunctive patterns

Multiple patterns that share the same right-hand-side can be factorized using the notation mult_pattern | … | mult_pattern. For instance, max can be rewritten as follows:

Coq < Fixpoint max (n m:nat) {struct m} : nat :=
Coq <   match n, m with
Coq <   | S n’, S m’ => S (max n’ m’)
Coq <   | 0, p | p, 0 => p
Coq <   end.
max is recursively defined (decreasing on 2nd argument)

Similarly, factorization of (non necessary multiple) patterns that share the same variables is possible by using the notation pattern | … | pattern. Here is an example:

Coq < Definition filter_2_4 (n:nat) : nat :=
Coq <   match n with
Coq <   | 2 as m | 4 as m => m
Coq <   | _ => 0
Coq <   end.
filter_2_4 is defined

Here is another example using disjunctive subpatterns.

Coq < Definition filter_some_square_corners (p:nat*nat) : nat*nat :=
Coq <   match p with
Coq <   | ((2 as m | 4 as m), (3 as n | 5 as n)) => (m,n)
Coq <   | _ => (0,0)
Coq <   end.
filter_some_square_corners is defined

17.2  About patterns of parametric types

When matching objects of a parametric type, constructors in patterns do not expect the parameter arguments. Their value is deduced during expansion. Consider for example the type of polymorphic lists:

Coq < Inductive List (A:Set) : Set :=
Coq <   | nil : List A
Coq <   | cons : A -> List A -> List A.
List is defined
List_rect is defined
List_ind is defined
List_rec is defined

We can check the function tail:

Coq < Check
Coq <   (fun l:List nat =>
Coq <      match l with
Coq <      | nil => nil nat
Coq <      | cons _ l’ => l’
Coq <      end).
fun l : List nat => match l with
                    | nil => nil nat
                    | cons _ l’ => l’
                    end
     : List nat -> List nat

When we use parameters in patterns there is an error message:

Coq < Check
Coq <   (fun l:List nat =>
Coq <      match l with
Coq <      | nil A => nil nat
Coq <      | cons A _ l’ => l’
Coq <      end).
Coq < Coq < Toplevel input, characters 196-201:
>      | nil A => nil nat
>        ^^^^^
Error: The constructor nil expects no arguments.

17.3  Matching objects of dependent types

The previous examples illustrate pattern matching on objects of non-dependent types, but we can also use the expansion strategy to destructure objects of dependent type. Consider the type listn of lists of a certain length:

Coq < Inductive listn : nat -> Set :=
Coq <   | niln : listn 0
Coq <   | consn : forall n:nat, nat -> listn n -> listn (S n).
listn is defined
listn_rect is defined
listn_ind is defined
listn_rec is defined

17.3.1  Understanding dependencies in patterns

We can define the function length over listn by:

Coq < Definition length (n:nat) (l:listn n) := n.
length is defined

Just for illustrating pattern matching, we can define it by case analysis:

Coq < Reset length.

Coq < Definition length (n:nat) (l:listn n) :=
Coq <   match l with
Coq <   | niln => 0
Coq <   | consn n _ _ => S n
Coq <   end.
length is defined

We can understand the meaning of this definition using the same notions of usual pattern matching.

17.3.2  When the elimination predicate must be provided

The examples given so far do not need an explicit elimination predicate because all the rhs have the same type and the strategy succeeds to synthesize it. Unfortunately when dealing with dependent patterns it often happens that we need to write cases where the type of the rhs are different instances of the elimination predicate. The function concat for listn is an example where the branches have different type and we need to provide the elimination predicate:

Coq < Fixpoint concat (n:nat) (l:listn n) (m:nat) (l’:listn m) {struct l} :
Coq <  listn (n + m) :=
Coq <   match l in listn n return listn (n + m) with
Coq <   | niln => l’
Coq <   | consn n’ a y => consn (n’ + m) a (concat n’ y m l’)
Coq <   end.
concat is recursively defined (decreasing on 2nd argument)

The elimination predicate is fun (n:nat) (l:listn n) => listn (n+m). In general if m has type (I q1qr t1ts) where q1qr are parameters, the elimination predicate should be of the form : fun y1ys x:(I q1qr y1ys) => Q.

In the concrete syntax, it should be written :

match m as x in (I _… _ y1… ysreturn Q with … end

The variables which appear in the in and as clause are new and bounded in the property Q in the return clause. The parameters of the inductive definitions should not be mentioned and are replaced by _.

Recall that a list of patterns is also a pattern. So, when we destructure several terms at the same time and the branches have different type we need to provide the elimination predicate for this multiple pattern. It is done using the same scheme, each term may be associated to an as and in clause in order to introduce a dependent product.

For example, an equivalent definition for concat (even though the matching on the second term is trivial) would have been:

Coq < Reset concat.

Coq < Fixpoint concat (n:nat) (l:listn n) (m:nat) (l’:listn m) {struct l} :
Coq <  listn (n + m) :=
Coq <   match l in listn n, l’ return listn (n + m) with
Coq <   | niln, x => x
Coq <   | consn n’ a y, x => consn (n’ + m) a (concat n’ y m x)
Coq <   end.
concat is recursively defined (decreasing on 2nd argument)

When the arity of the predicate (i.e. number of abstractions) is not correct Coq raises an error message. For example:

Coq < Fixpoint concat
Coq <  (n:nat) (l:listn n) (m:nat)
Coq <  (l’:listn m) {struct l} : listn (n + m) :=
Coq <   match l, l’ with
Coq <   | niln, x => x
Coq <   | consn n’ a y, x => consn (n’ + m) a (concat n’ y m x)
Coq <   end.
Coq < Coq < Coq < Toplevel input, characters 342-343:
>   | niln, x => x
>                ^
Error:
In environment
concat : forall n : nat,
         listn n -> forall m : nat, listn m -> listn (n + m)
n : nat
l : listn n
m : nat
l’ : listn m
The term "l’" has type "listn m" while it is expected to have type
 "listn (?47 + ?48)".

17.4  Using pattern matching to write proofs

In all the previous examples the elimination predicate does not depend on the object(s) matched. But it may depend and the typical case is when we write a proof by induction or a function that yields an object of dependent type. An example of proof using match in given in Section 8.2.3.

For example, we can write the function buildlist that given a natural number n builds a list of length n containing zeros as follows:

Coq < Fixpoint buildlist (n:nat) : listn n :=
Coq <   match n return listn n with
Coq <   | O => niln
Coq <   | S n => consn n 0 (buildlist n)
Coq <   end.
buildlist is recursively defined (decreasing on 1st argument)

We can also use multiple patterns. Consider the following definition of the predicate less-equal Le:

Coq < Inductive LE : nat -> nat -> Prop :=
Coq <   | LEO : forall n:nat, LE 0 n
Coq <   | LES : forall n m:nat, LE n m -> LE (S n) (S m).
LE is defined
LE_ind is defined

We can use multiple patterns to write the proof of the lemma forall (n m:nat), (LE n m)\/(LE m n):

Coq < Fixpoint dec (n m:nat) {struct n} : LE n m \/ LE m n :=
Coq <   match n, m return LE n m \/ LE m n with
Coq <   | O, x => or_introl (LE x 0) (LEO x)
Coq <   | x, O => or_intror (LE x 0) (LEO x)
Coq <   | S n as n’, S m as m’ =>
Coq <       match dec n m with
Coq <       | or_introl h => or_introl (LE m’ n’) (LES n m h)
Coq <       | or_intror h => or_intror (LE n’ m’) (LES m n h)
Coq <       end
Coq <   end.
dec is recursively defined (decreasing on 1st argument)

In the example of dec, the first match is dependent while the second is not.

The user can also use match in combination with the tactic refine (see Section 8.2.3) to build incomplete proofs beginning with a match construction.

17.5  Pattern-matching on inductive objects involving local definitions

If local definitions occur in the type of a constructor, then there are two ways to match on this constructor. Either the local definitions are skipped and matching is done only on the true arguments of the constructors, or the bindings for local definitions can also be caught in the matching.

Example.

Coq < Inductive list : nat -> Set :=
Coq <   | nil : list 0
Coq <   | cons : forall n:nat, let m := (2 * n) in list m -> list (S (S m)).

In the next example, the local definition is not caught.

Coq < Fixpoint length n (l:list n) {struct l} : nat :=
Coq <   match l with
Coq <   | nil => 0
Coq <   | cons n l0 => S (length (2 * n) l0)
Coq <   end.
length is recursively defined (decreasing on 2nd argument)

But in this example, it is.

Coq < Fixpoint length’ n (l:list n) {struct l} : nat :=
Coq <   match l with
Coq <   | nil => 0
Coq <   | cons _ m l0 => S (length’ m l0)
Coq <   end.
length’ is recursively defined (decreasing on 2nd argument)


Remark: for a given matching clause, either none of the local definitions or all of them can be caught.

17.6  Pattern-matching and coercions

If a mismatch occurs between the expected type of a pattern and its actual type, a coercion made from constructors is sought. If such a coercion can be found, it is automatically inserted around the pattern.

Example:

Coq < Inductive I : Set :=
Coq <   | C1 : nat -> I
Coq <   | C2 : I -> I.
I is defined
I_rect is defined
I_ind is defined
I_rec is defined

Coq < Coercion C1 : nat >-> I.
C1 is now a coercion

Coq < Check (fun x => match x with
Coq <                 | C2 O => 0
Coq <                 | _ => 0
Coq <                 end).
fun x : I =>
match x with
| C1 _ => 0
| C2 (C1 0) => 0
| C2 (C1 (S _)) => 0
| C2 (C2 _) => 0
end
     : I -> nat

17.7  When does the expansion strategy fail ?

The strategy works very like in ML languages when treating patterns of non-dependent type. But there are new cases of failure that are due to the presence of dependencies.

The error messages of the current implementation may be sometimes confusing. When the tactic fails because patterns are somehow incorrect then error messages refer to the initial expression. But the strategy may succeed to build an expression whose sub-expressions are well typed when the whole expression is not. In this situation the message makes reference to the expanded expression. We encourage users, when they have patterns with the same outer constructor in different equations, to name the variable patterns in the same positions with the same name. E.g. to write (cons n O x) => e1 and (cons n _ x) => e2 instead of (cons n O x) => e1 and (cons n’ _ x’) => e2. This helps to maintain certain name correspondence between the generated expression and the original.

Here is a summary of the error messages corresponding to each situation:


Error messages:

  1. The constructor ident expects num arguments

    The variable ident is bound several times in pattern term

    Found a constructor of inductive type term while a constructor of term is expected

    Patterns are incorrect (because constructors are not applied to the correct number of the arguments, because they are not linear or they are wrongly typed).

  2. Non exhaustive pattern-matching

    The pattern matching is not exhaustive.

  3. The elimination predicate term should be of arity num (for non dependent case) or num (for dependent case)

    The elimination predicate provided to match has not the expected arity.

  4. Unable to infer a match predicate
    Either there is a type incompatiblity or the problem involves
    dependencies

    There is a type mismatch between the different branches. The user should provide an elimination predicate.