Library InductiveTypes
Proof Terms
Check (fun x : True => x).
: True -> True
The identity program is interpreted as a proof that True, the always-true proposition, implies itself! What we see is that Curry-Howard interprets implications as functions, where an input is a proposition being assumed and an output is a proposition being deduced. This intuition is not too far from a common one for informal theorem proving, where we might already think of an implication proof as a process for transforming a hypothesis into a conclusion.
There are also more primitive proof forms available. For instance, the term I is the single proof of True, applicable in any context.
Check I.
: True
With I, we can prove another simple propositional theorem.
Check (fun _ : False => I).
: False -> True
No proofs of False exist in the top-level context, but the implication-as-function analogy gives us an easy way to, for example, show that False implies itself.
Check (fun x : False => x).
: False -> False
Every one of these example programs whose type looks like a logical formula is a proof term. We use that name for any Gallina term of a logical type, and we will elaborate shortly on what makes a type logical.
In the rest of this chapter, we will introduce different ways of defining types. Every example type can be interpreted alternatively as a type of programs or proofs.
One of the first types we introduce will be bool, with constructors true and false. Newcomers to Coq often wonder about the distinction between True and true and the distinction between False and false. One glib answer is that True and False are types, but true and false are not. A more useful answer is that Coq's metatheory guarantees that any term of type bool evaluates to either true or false. This means that we have an algorithm for answering any question phrased as an expression of type bool. Conversely, most propositions do not evaluate to True or False; the language of inductively defined propositions is much richer than that. We ought to be glad that we have no algorithm for deciding our formalized version of mathematical truth, since otherwise it would be clear that we could not formalize undecidable properties, like almost any interesting property of general-purpose programs.
Enumerations
Coq inductive types generalize the algebraic datatypes found in Haskell and ML. Confusingly enough, inductive types also generalize generalized algebraic datatypes (GADTs), by adding the possibility for type dependency. Even so, it is worth backing up from the examples of the last chapter and going over basic, algebraic-datatype uses of inductive datatypes, because the chance to prove things about the values of these types adds new wrinkles beyond usual practice in Haskell and ML.
The singleton type unit is an inductive type:
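A minimal sketch of such a definition, assuming only the single constructor tt that the text describes:

```coq
(* A type in Set with exactly one constructor, and hence exactly one value. *)
Inductive unit : Set :=
  | tt.
```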
This vernacular command defines a new inductive type unit whose only value is tt. We can verify the types of the two identifiers we introduce:
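The checks, and a singleton theorem of the kind discussed next, might look as follows. This is a sketch: the theorem name unit_singleton is our own label, and the snippet works equally with the standard library's unit.

```coq
Check unit.  (* unit : Set *)
Check tt.    (* tt : unit *)

(* Every value of type unit is equal to tt. *)
Theorem unit_singleton : forall x : unit, x = tt.
Proof.
  induction x.   (* one case, for the constructor tt *)
  reflexivity.
Qed.
```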
The important thing about an inductive type is, unsurprisingly, that you can do induction over its values, and induction is the key to proving this theorem. We ask to proceed by induction on the variable x.
induction x.
reflexivity.
Qed.
It seems kind of odd to write a proof by induction with no inductive hypotheses. We could have arrived at the same result by beginning the proof with:
destruct x.
...which corresponds to "proof by case analysis" in classical math. For non-recursive inductive types, the two tactics will always have identical behavior. Often case analysis is sufficient, even in proofs about recursive types, and it is nice to avoid introducing unneeded induction hypotheses.
What exactly is the induction principle for unit? We can ask Coq:
Check unit_ind.
unit_ind : forall P : unit -> Prop, P tt -> forall u : unit, P u
Every Inductive command defining a type T also defines an induction principle named T_ind. Recall from the last section that our type, operations over it, and principles for reasoning about it all live in the same language and are described by the same type system. The key to telling what is a program and what is a proof lies in the distinction between the type Prop, which appears in our induction principle, and the type Set, which we have seen a few times already.
The convention goes like this: Set is the type of normal types used in programming, and the values of such types are programs. Prop is the type of logical propositions, and the values of such types are proofs. Thus, an induction principle has a type that shows us that it is a function for building proofs.
Specifically, unit_ind quantifies over a predicate P over unit values. If we can present a proof that P holds of tt, then we are rewarded with a proof that P holds for any value u of type unit. In our last proof, the predicate was (fun u : unit => u = tt).
The definition of unit places the type in Set. By replacing Set with Prop, unit with True, and tt with I, we arrive at precisely the definition of True that the Coq standard library employs! The program type unit is the Curry-Howard equivalent of the proposition True. We might make the tongue-in-cheek claim that, while philosophers have expended much ink on the nature of truth, we have now determined that truth is the unit type of functional programming.
We can define an inductive type even simpler than unit:
Empty_set has no elements. We can prove fun theorems about it:
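A sketch of the definition and of one such theorem, assuming the statement 2 + 2 = 5 that the discussion below refers to. The theorem name is an illustrative choice of ours.

```coq
(* An inductive type with no constructors at all. *)
Inductive Empty_set : Set := .

(* From an element of Empty_set, anything follows. *)
Theorem the_sky_is_falling : forall x : Empty_set, 2 + 2 = 5.
Proof.
  destruct 1.   (* case analysis on the first premise: there are no cases *)
Qed.
```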
Because Empty_set has no elements, the fact of having an element of this type implies anything. We use destruct 1 instead of destruct x in the proof because unused quantified variables are relegated to being referred to by number. (There is a good reason for this, related to the unity of quantifiers and implication. At least within Coq's logical foundation of constructive logic, which we elaborate on more in the next chapter, an implication is just a quantification over a proof, where the quantified variable is never used. It generally makes more sense to refer to implication hypotheses by number than by name, and Coq treats our quantifier over an unused variable as an implication in determining the proper behavior.)
We can see the induction principle that made this proof so easy:
Check Empty_set_ind.
Empty_set_ind : forall (P : Empty_set -> Prop) (e : Empty_set), P e
In other words, any predicate over values from the empty set holds vacuously of every such element. In the last proof, we chose the predicate (fun _ : Empty_set => 2 + 2 = 5).
We can also apply this get-out-of-jail-free card programmatically. Here is a lazy way of converting values of Empty_set to values of unit:
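One way to write such a conversion is sketched below. The name e2u is hypothetical, and the snippet relies on the standard library's Empty_set and unit.

```coq
(* A match on a value of a constructor-free type needs no branches. *)
Definition e2u (e : Empty_set) : unit :=
  match e with end.
```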
We employ match pattern matching as in the last chapter. Since we match on a value whose type has no constructors, there is no need to provide any branches. It turns out that Empty_set is the Curry-Howard equivalent of False. As for why Empty_set starts with a capital letter and not a lowercase letter like unit does, we must refer the reader to the authors of the Coq standard library, to which we try to be faithful.
Moving up the ladder of complexity, we can define the Booleans:
We can use less vacuous pattern matching to define Boolean negation.
An alternative definition desugars to the above, thanks to an if notation overloaded to work with any inductive type that has exactly two constructors:
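The three definitions just described can be sketched as follows, assuming the constructor names true and false used throughout this chapter. The name negb' for the if-based variant is our own.

```coq
Inductive bool : Set :=
  | true
  | false.

(* Negation by explicit pattern matching. *)
Definition negb (b : bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

(* The same function using the if notation, which applies to any
   inductive type with exactly two constructors. *)
Definition negb' (b : bool) : bool :=
  if b then false else true.
```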
We might want to prove that negb is its own inverse operation.
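A sketch of the theorem in question, assuming the statement suggested by the subgoals shown below. The name negb_inverse is illustrative, and the snippet also works with the standard library's bool and negb.

```coq
Theorem negb_inverse : forall b : bool, negb (negb b) = b.
Proof.
  destruct b.
  - reflexivity.   (* negb (negb true) computes to true *)
  - reflexivity.   (* negb (negb false) computes to false *)
Qed.
```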
After we case-analyze on b, we are left with one subgoal for each constructor of bool.
2 subgoals
============================
negb (negb true) = true
subgoal 2 is
negb (negb false) = false
The first subgoal follows by Coq's rules of computation, so we can dispatch it easily:
reflexivity.
Likewise for the second subgoal, so we can restart the proof and give a very compact justification.
Restart.
destruct b; reflexivity.
Qed.
Another theorem about Booleans illustrates another useful tactic.
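For instance, we might state that negation never returns its argument. This is a sketch with an illustrative theorem name, using the standard library's bool and negb.

```coq
Theorem negb_ineq : forall b : bool, negb b <> b.
Proof.
  (* Each case reduces to an equation between distinct constructors. *)
  destruct b; discriminate.
Qed.
```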
The discriminate tactic is used to prove that two values of an inductive type are not equal, whenever the values are formed with different constructors. In this case, the different constructors are true and false.
At this point, it is probably not hard to guess what the underlying induction principle for bool is.
Check bool_ind.
bool_ind : forall P : bool -> Prop, P true -> P false -> forall b : bool, P b
That is, to prove that a property describes all bools, prove that it describes both true and false.
There is no interesting Curry-Howard analogue of bool. Of course, we can define such a type by replacing Set by Prop above, but the proposition we arrive at is not very useful. It is logically equivalent to True, but it provides two indistinguishable primitive proofs, true and false. In the rest of the chapter, we will skip commenting on Curry-Howard versions of inductive definitions where such versions are not interesting.
Simple Recursive Types
The natural numbers are the simplest common example of an inductive type that actually deserves the name.
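A sketch of the definition, assuming the constructor names O and S used in the rest of this section:

```coq
Inductive nat : Set :=
  | O : nat          (* zero *)
  | S : nat -> nat.  (* successor: the one recursive constructor *)
```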
The constructor O is zero, and S is the successor function, so that 0 is syntactic sugar for O, 1 for S O, 2 for S (S O), and so on.
Pattern matching works as we demonstrated in the last chapter:
Definition isZero (n : nat) : bool :=
match n with
| O => true
| S _ => false
end.
Definition pred (n : nat) : nat :=
match n with
| O => O
| S n' => n'
end.
We can prove theorems by case analysis with destruct as for simpler inductive types, but we can also now get into genuine inductive theorems. First, we will need a recursive function, to make things interesting.
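A sketch of such a function, assuming the usual definition of addition by recursion on the first argument, which is consistent with the goals displayed later in this section:

```coq
Fixpoint plus (n m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)
  end.
```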
Recall that Fixpoint is Coq's mechanism for recursive function definitions. Some theorems about plus can be proved without induction.
Coq's computation rules automatically simplify the application of plus, because unfolding the definition of plus gives us a match expression where the branch to be taken is obvious from syntax alone. If we just reverse the order of the arguments, though, this no longer works, and we need induction.
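The contrast can be sketched with the following pair of theorems. The definition of plus is repeated so that the snippet stands alone, the name O_plus_n is our own, and n_plus_O is the name this chapter later registers as a rewriting hint.

```coq
Fixpoint plus (n m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)
  end.

(* plus matches on its first argument, so plus O n reduces by computation. *)
Theorem O_plus_n : forall n : nat, plus O n = n.
Proof.
  intro n.
  reflexivity.
Qed.

(* Here the first argument is a variable, so no reduction applies
   and we proceed by induction instead. *)
Theorem n_plus_O : forall n : nat, plus n O = n.
Proof.
  induction n.
  - reflexivity.
  - simpl. rewrite IHn. reflexivity.
Qed.
```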
reflexivity.
Our second subgoal requires more work and also demonstrates our first inductive hypothesis.
n : nat
IHn : plus n O = n
============================
plus (S n) O = S n
We can start out by using computation to simplify the goal as far as we can.
simpl.
rewrite IHn.
reflexivity.
Not much really went on in this proof, so the crush tactic from the CpdtTactics module can prove this theorem automatically.
Restart.
induction n; crush.
Qed.
We can check out the induction principle at work here:
Check nat_ind.
nat_ind : forall P : nat -> Prop,
P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n
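The tactics discussed next apply to statements such as injectivity of the constructor S. A sketch, assuming that statement and the illustrative name S_inj:

```coq
Theorem S_inj : forall n m : nat, S n = S m -> n = m.
Proof.
  (* injection 1 works on the first premise, S n = S m,
     leaving the goal n = m -> n = m for trivial. *)
  injection 1; trivial.
Qed.
```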
The injection tactic refers to a premise by number, adding new equalities between the corresponding arguments of equated terms that are formed with the same constructor. We end up needing to prove n = m -> n = m, so it is unsurprising that a tactic named trivial is able to finish the proof. This tactic attempts a variety of single proof steps, drawn from a user-specified database that we will later see how to extend.
There is also a very useful tactic called congruence that can prove this theorem immediately. The congruence tactic generalizes discriminate and injection, and it also adds reasoning about the general properties of equality, such as that a function returns equal results on equal arguments. That is, congruence is a complete decision procedure for the theory of equality and uninterpreted functions, plus some smarts about inductive types.
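As a sketch of that claim, here is the same injectivity statement proved with congruence. The theorem name is hypothetical, and we introduce the hypotheses explicitly rather than relying on the tactic to do so.

```coq
Theorem S_inj_congruence : forall n m : nat, S n = S m -> n = m.
Proof.
  intros n m H.
  congruence.   (* uses injectivity of S on hypothesis H *)
Qed.
```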
We can define a type of lists of natural numbers.
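A sketch of such a type, assuming the constructor names NNil and NCons that the functions below pattern-match on:

```coq
Inductive nat_list : Set :=
  | NNil : nat_list
  | NCons : nat -> nat_list -> nat_list.
```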
Recursive definitions over nat_list are straightforward extensions of what we have seen before.
Fixpoint nlength (ls : nat_list) : nat :=
match ls with
| NNil => O
| NCons _ ls' => S (nlength ls')
end.
Fixpoint napp (ls1 ls2 : nat_list) : nat_list :=
match ls1 with
| NNil => ls2
| NCons n ls1' => NCons n (napp ls1' ls2)
end.
Inductive theorem proving can again be automated quite effectively.
Theorem nlength_napp : forall ls1 ls2 : nat_list, nlength (napp ls1 ls2)
= plus (nlength ls1) (nlength ls2).
induction ls1; crush.
Qed.
Check nat_list_ind.
nat_list_ind
: forall P : nat_list -> Prop,
P NNil ->
(forall (n : nat) (n0 : nat_list), P n0 -> P (NCons n n0)) ->
forall n : nat_list, P n
Inductive nat_btree : Set :=
| NLeaf : nat_btree
| NNode : nat_btree -> nat -> nat_btree -> nat_btree.
Here are two functions whose intuitive explanations are not so important. The first one computes the size of a tree, and the second performs some sort of splicing of one tree into the leftmost available leaf node of another.
Fixpoint nsize (tr : nat_btree) : nat :=
match tr with
| NLeaf => S O
| NNode tr1 _ tr2 => plus (nsize tr1) (nsize tr2)
end.
Fixpoint nsplice (tr1 tr2 : nat_btree) : nat_btree :=
match tr1 with
| NLeaf => NNode tr2 O NLeaf
| NNode tr1' n tr2' => NNode (nsplice tr1' tr2) n tr2'
end.
Theorem plus_assoc : forall n1 n2 n3 : nat, plus (plus n1 n2) n3 = plus n1 (plus n2 n3).
induction n1; crush.
Qed.
Theorem nsize_nsplice : forall tr1 tr2 : nat_btree, nsize (nsplice tr1 tr2)
= plus (nsize tr2) (nsize tr1).
Hint Rewrite n_plus_O plus_assoc.
induction tr1; crush.
Qed.
It is convenient that these proofs go through so easily, but it is still useful to look into the details of what happened, by checking the statement of the tree induction principle.
Check nat_btree_ind.
nat_btree_ind
: forall P : nat_btree -> Prop,
P NLeaf ->
(forall n : nat_btree,
P n -> forall (n0 : nat) (n1 : nat_btree), P n1 -> P (NNode n n0 n1)) ->
forall n : nat_btree, P n
Parameterized Types
Inductive list (T : Set) : Set :=
| Nil : list T
| Cons : T -> list T -> list T.
Fixpoint length T (ls : list T) : nat :=
match ls with
| Nil => O
| Cons _ ls' => S (length ls')
end.
Fixpoint app T (ls1 ls2 : list T) : list T :=
match ls1 with
| Nil => ls2
| Cons x ls1' => Cons x (app ls1' ls2)
end.
Theorem length_app : forall T (ls1 ls2 : list T), length (app ls1 ls2)
= plus (length ls1) (length ls2).
induction ls1; crush.
Qed.
There is a useful shorthand for writing many definitions that share the same parameter, based on Coq's section mechanism. The following block of code is equivalent to the above:
Section list.
Variable T : Set.
Inductive list : Set :=
| Nil : list
| Cons : T -> list -> list.
Fixpoint length (ls : list) : nat :=
match ls with
| Nil => O
| Cons _ ls' => S (length ls')
end.
Fixpoint app (ls1 ls2 : list) : list :=
match ls1 with
| Nil => ls2
| Cons x ls1' => Cons x (app ls1' ls2)
end.
Theorem length_app : forall ls1 ls2 : list, length (app ls1 ls2)
= plus (length ls1) (length ls2).
induction ls1; crush.
Qed.
End list.
Implicit Arguments Nil [T].
After we end the section, the Variables we used are added as extra function parameters for each defined identifier, as needed. With an Implicit Arguments command, we ask that T be inferred when we use Nil; Coq's heuristics already decided to apply a similar policy to Cons, because of the Set Implicit Arguments command elided at the beginning of this chapter. We verify that our definitions have been saved properly using the Print command, a cousin of Check which shows the definition of a symbol, rather than just its type.
Print list.
Inductive list (T : Set) : Set :=
Nil : list T | Cons : T -> list T -> list T
Check length.
length
: forall T : Set, list T -> nat
Check list_ind.
list_ind
: forall (T : Set) (P : list T -> Prop),
P (Nil T) ->
(forall (t : T) (l : list T), P l -> P (Cons t l)) ->
forall l : list T, P l
Mutually Inductive Types
Inductive even_list : Set :=
| ENil : even_list
| ECons : nat -> odd_list -> even_list
with odd_list : Set :=
| OCons : nat -> even_list -> odd_list.
Fixpoint elength (el : even_list) : nat :=
match el with
| ENil => O
| ECons _ ol => S (olength ol)
end
with olength (ol : odd_list) : nat :=
match ol with
| OCons _ el => S (elength el)
end.
Fixpoint eapp (el1 el2 : even_list) : even_list :=
match el1 with
| ENil => el2
| ECons n ol => ECons n (oapp ol el2)
end
with oapp (ol : odd_list) (el : even_list) : odd_list :=
match ol with
| OCons n el' => OCons n (eapp el' el)
end.
Everything is going roughly the same as in past examples, until we try to prove a theorem similar to those that came before.
Theorem elength_eapp : forall el1 el2 : even_list,
elength (eapp el1 el2) = plus (elength el1) (elength el2).
induction el1; crush.
One goal remains:
n : nat
o : odd_list
el2 : even_list
============================
S (olength (oapp o el2)) = S (plus (olength o) (elength el2))
We have no induction hypothesis, so we cannot prove this goal without starting another induction, which would reach a similar point, sending us into a futile infinite chain of inductions. The problem is that Coq's generation of T_ind principles is incomplete. We only get non-mutual induction principles generated by default.
Check even_list_ind.
even_list_ind
: forall P : even_list -> Prop,
P ENil ->
(forall (n : nat) (o : odd_list), P (ECons n o)) ->
forall e : even_list, P e
Scheme even_list_mut := Induction for even_list Sort Prop
with odd_list_mut := Induction for odd_list Sort Prop.
This invocation of Scheme asks for the creation of induction principles even_list_mut for the type even_list and odd_list_mut for the type odd_list. The Induction keyword says we want standard induction schemes, since Scheme supports more exotic choices. Finally, Sort Prop establishes that we really want induction schemes, not recursion schemes, which are the same according to Curry-Howard, save for the Prop/Set distinction.
Check even_list_mut.
even_list_mut
: forall (P : even_list -> Prop) (P0 : odd_list -> Prop),
P ENil ->
(forall (n : nat) (o : odd_list), P0 o -> P (ECons n o)) ->
(forall (n : nat) (e : even_list), P e -> P0 (OCons n e)) ->
forall e : even_list, P e
Here we use apply, which is one of the most essential basic tactics. When we are trying to prove fact P, and when thm is a theorem whose conclusion can be made to match P by proper choice of quantified variable values, the invocation apply thm will replace the current goal with one new goal for each premise of thm.
This use of apply may seem a bit too magical. To better see what is going on, we use a variant where we partially apply the theorem nat_ind to give an explicit value for the predicate that gives our induction hypothesis.
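A sketch of that variant, proving the earlier fact about plus without the induction tactic. The theorem name n_plus_O' is illustrative, plus is repeated so that the snippet stands alone, and the cases are finished by hand rather than with crush.

```coq
Fixpoint plus (n m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)
  end.

Theorem n_plus_O' : forall n : nat, plus n O = n.
Proof.
  (* Pass the induction predicate to nat_ind explicitly. *)
  apply (nat_ind (fun n => plus n O = n)).
  - (* Premise P O *)
    reflexivity.
  - (* Premise forall n, P n -> P (S n) *)
    intros n IHn.
    simpl.
    rewrite IHn.
    reflexivity.
Qed.
```

Each bullet corresponds to one premise of nat_ind, which is exactly the bookkeeping that induction performs on our behalf.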
From this example, we can see that induction is not magic. It only does some bookkeeping for us to make it easy to apply a theorem, which we can do directly with the apply tactic.
This technique generalizes to our mutual example:
Theorem elength_eapp : forall el1 el2 : even_list,
elength (eapp el1 el2) = plus (elength el1) (elength el2).
apply (even_list_mut
(fun el1 : even_list => forall el2 : even_list,
elength (eapp el1 el2) = plus (elength el1) (elength el2))
(fun ol : odd_list => forall el : even_list,
olength (oapp ol el) = plus (olength ol) (elength el))); crush.
Qed.
We simply need to specify two predicates, one for each of the mutually inductive types. In general, it is not a good idea to assume that a proof assistant can infer extra predicates, so this way of applying mutual induction is about as straightforward as we may hope for.
Reflexive Types
A kind of inductive type called a reflexive type includes at least one constructor that takes as an argument a function returning the same type we are defining. One very useful class of examples is in modeling variable binders. Our example will be an encoding of the syntax of first-order logic. Since the idea of syntactic encodings of logic may require a bit of acclimation, let us first consider a simpler formula type for a subset of propositional logic. We are not yet using a reflexive type, but later we will extend the example reflexively.
Inductive pformula : Set :=
| Truth : pformula
| Falsehood : pformula
| Conjunction : pformula -> pformula -> pformula.
A key distinction here is between, for instance, the syntax Truth and its semantics True. We can make the semantics explicit with a recursive function. This function uses the infix operator /\, which desugars to instances of the type family and from the standard library. The family and implements conjunction, the Prop Curry-Howard analogue of the usual pair type from functional programming (which is the type family prod in Coq's standard library).
Fixpoint pformulaDenote (f : pformula) : Prop :=
match f with
| Truth => True
| Falsehood => False
| Conjunction f1 f2 => pformulaDenote f1 /\ pformulaDenote f2
end.
This is just a warm-up that does not use reflexive types, the new feature we mean to introduce. When we set our sights on first-order logic instead, it becomes very handy to give constructors recursive arguments that are functions.
Inductive formula : Set :=
| Eq : nat -> nat -> formula
| And : formula -> formula -> formula
| Forall : (nat -> formula) -> formula.
Our kinds of formulas are equalities between naturals, conjunction, and universal quantification over natural numbers. We avoid needing to include a notion of "variables" in our type, by using Coq functions to encode the syntax of quantification. For instance, here is the encoding of forall x : nat, x = x:
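A sketch of that encoding, with the formula type repeated so that the snippet stands alone. The name forall_refl is our own.

```coq
Inductive formula : Set :=
  | Eq : nat -> nat -> formula
  | And : formula -> formula -> formula
  | Forall : (nat -> formula) -> formula.

(* The bound variable x is an ordinary Coq function argument. *)
Example forall_refl : formula := Forall (fun x => Eq x x).
```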
We can write recursive functions over reflexive types quite naturally. Here is one translating our formulas into native Coq propositions.
Fixpoint formulaDenote (f : formula) : Prop :=
match f with
| Eq n1 n2 => n1 = n2
| And f1 f2 => formulaDenote f1 /\ formulaDenote f2
| Forall f' => forall n : nat, formulaDenote (f' n)
end.
We can also encode a trivial formula transformation that swaps the order of equality and conjunction operands.
Fixpoint swapper (f : formula) : formula :=
match f with
| Eq n1 n2 => Eq n2 n1
| And f1 f2 => And (swapper f2) (swapper f1)
| Forall f' => Forall (fun n => swapper (f' n))
end.
It is helpful to prove that this transformation does not make true formulas false.
Theorem swapper_preserves_truth : forall f, formulaDenote f -> formulaDenote (swapper f).
induction f; crush.
Qed.
We can take a look at the induction principle behind this proof.
Check formula_ind.
formula_ind
: forall P : formula -> Prop,
(forall n n0 : nat, P (Eq n n0)) ->
(forall f0 : formula,
P f0 -> forall f1 : formula, P f1 -> P (And f0 f1)) ->
(forall f1 : nat -> formula,
(forall n : nat, P (f1 n)) -> P (Forall f1)) ->
forall f2 : formula, P f2
Inductive term : Set :=
| App : term -> term -> term
| Abs : (term -> term) -> term.
Error: Non strictly positive occurrence of "term" in "(term -> term) -> term"
Definition uhoh (t : term) : term :=
match t with
| Abs f => f t
| _ => t
end.
An Interlude on Induction Principles
Print nat_ind.
nat_ind =
fun P : nat -> Prop => nat_rect P
: forall P : nat -> Prop,
P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n
Check nat_rect.
nat_rect
: forall P : nat -> Type,
P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n
Print nat_rec.
nat_rec =
fun P : nat -> Set => nat_rect P
: forall P : nat -> Set,
P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n
Fixpoint plus_recursive (n : nat) : nat -> nat :=
match n with
| O => fun m => m
| S n' => fun m => S (plus_recursive n' m)
end.
Definition plus_rec : nat -> nat -> nat :=
nat_rec (fun _ : nat => nat -> nat) (fun m => m) (fun _ r m => S (r m)).
Theorem plus_equivalent : plus_recursive = plus_rec.
reflexivity.
Qed.
Going even further down the rabbit hole, nat_rect itself is not even a primitive. It is a functional program that we can write manually.
Print nat_rect.
nat_rect =
fun (P : nat -> Type) (f : P O) (f0 : forall n : nat, P n -> P (S n)) =>
fix F (n : nat) : P n :=
match n as n0 return (P n0) with
| O => f
| S n0 => f0 n0 (F n0)
end
: forall P : nat -> Type,
P O -> (forall n : nat, P n -> P (S n)) -> forall n : nat, P n
Fixpoint nat_rect' (P : nat -> Type)
(HO : P O)
(HS : forall n, P n -> P (S n)) (n : nat) :=
match n return P n with
| O => HO
| S n' => HS n' (nat_rect' P HO HS n')
end.
Section nat_ind'.
First, we have the property of natural numbers that we aim to prove.
Variable P : nat -> Prop.
Then we require a proof of the O case, which we declare with the command Hypothesis, which is a synonym for Variable that, by convention, is used for variables whose types are propositions.
Hypothesis O_case : P O.
Next is a proof of the S case, which may assume an inductive hypothesis.
Hypothesis S_case : forall n : nat, P n -> P (S n).
Finally, we define a recursive function to tie the pieces together.
Fixpoint nat_ind' (n : nat) : P n :=
match n with
| O => O_case
| S n' => S_case (nat_ind' n')
end.
End nat_ind'.
Closing the section adds the Variables and Hypotheses as new fun-bound arguments to nat_ind', and, modulo the use of Prop instead of Type, we end up with the exact same definition that was generated automatically for nat_rect.
We can also examine the definition of even_list_mut, which we generated with Scheme for a mutually recursive type.
Print even_list_mut.
even_list_mut =
fun (P : even_list -> Prop) (P0 : odd_list -> Prop)
(f : P ENil) (f0 : forall (n : nat) (o : odd_list), P0 o -> P (ECons n o))
(f1 : forall (n : nat) (e : even_list), P e -> P0 (OCons n e)) =>
fix F (e : even_list) : P e :=
match e as e0 return (P e0) with
| ENil => f
| ECons n o => f0 n o (F0 o)
end
with F0 (o : odd_list) : P0 o :=
match o as o0 return (P0 o0) with
| OCons n e => f1 n e (F e)
end
for F
: forall (P : even_list -> Prop) (P0 : odd_list -> Prop),
P ENil ->
(forall (n : nat) (o : odd_list), P0 o -> P (ECons n o)) ->
(forall (n : nat) (e : even_list), P e -> P0 (OCons n e)) ->
forall e : even_list, P e
Section even_list_mut'.
First, we need the properties that we are proving.
Variable Peven : even_list -> Prop.
Variable Podd : odd_list -> Prop.
Next, we need proofs of the three cases.
Hypothesis ENil_case : Peven ENil.
Hypothesis ECons_case : forall (n : nat) (o : odd_list), Podd o -> Peven (ECons n o).
Hypothesis OCons_case : forall (n : nat) (e : even_list), Peven e -> Podd (OCons n e).
Finally, we define the recursive functions.
Fixpoint even_list_mut' (e : even_list) : Peven e :=
match e with
| ENil => ENil_case
| ECons n o => ECons_case n (odd_list_mut' o)
end
with odd_list_mut' (o : odd_list) : Podd o :=
match o with
| OCons n e => OCons_case n (even_list_mut' e)
end.
End even_list_mut'.
Even induction principles for reflexive types are easy to implement directly. For our formula type, we can use a recursive definition much like those we wrote above.
Section formula_ind'.
Variable P : formula -> Prop.
Hypothesis Eq_case : forall n1 n2 : nat, P (Eq n1 n2).
Hypothesis And_case : forall f1 f2 : formula,
P f1 -> P f2 -> P (And f1 f2).
Hypothesis Forall_case : forall f : nat -> formula,
(forall n : nat, P (f n)) -> P (Forall f).
Fixpoint formula_ind' (f : formula) : P f :=
match f with
| Eq n1 n2 => Eq_case n1 n2
| And f1 f2 => And_case (formula_ind' f1) (formula_ind' f2)
| Forall f' => Forall_case f' (fun n => formula_ind' (f' n))
end.
End formula_ind'.
It is apparent that induction principle implementations involve some tedium but not terribly much creativity.
Nested Inductive Types
Suppose we want to extend our earlier type of binary trees to trees with arbitrary finite branching. We can use lists to give a simple definition.
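A sketch of such a definition, assuming the constructor name NNode' that appears in the induction principle below. The list type from earlier in the chapter is repeated so that the snippet stands alone.

```coq
Inductive list (T : Set) : Set :=
  | Nil : list T
  | Cons : T -> list T -> list T.

(* Each node carries a natural number and a list of subtrees. *)
Inductive nat_tree : Set :=
  | NNode' : nat -> list nat_tree -> nat_tree.
```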
This is an example of a nested inductive type definition, because we use the type we are defining as an argument to a parameterized type family. Coq will not allow all such definitions; it effectively pretends that we are defining nat_tree mutually with a version of list specialized to nat_tree, checking that the resulting expanded definition satisfies the usual rules. For instance, if we replaced list with a type family that used its parameter as a function argument, then the definition would be rejected as violating the positivity restriction.
As we encountered with mutual inductive types, we find that the automatically generated induction principle for nat_tree is too weak.
Check nat_tree_ind.
nat_tree_ind
: forall P : nat_tree -> Prop,
(forall (n : nat) (l : list nat_tree), P (NNode' n l)) ->
forall n : nat_tree, P n
Section All.
Variable T : Set.
Variable P : T -> Prop.
Fixpoint All (ls : list T) : Prop :=
match ls with
| Nil => True
| Cons h t => P h /\ All t
end.
End All.
It will be useful to review the definitions of True and /\, since we will want to write manual proofs of them below.
Print True.
Inductive True : Prop := I : True
Locate "/\".
Print and.
Inductive and (A : Prop) (B : Prop) : Prop := conj : A -> B -> A /\ B
For conj: Arguments A, B are implicit
Section nat_tree_ind'.
Variable P : nat_tree -> Prop.
Hypothesis NNode'_case : forall (n : nat) (ls : list nat_tree),
All P ls -> P (NNode' n ls).
A first attempt at writing the induction principle itself follows the intuition that nested inductive type definitions are expanded into mutual inductive definitions.
Fixpoint nat_tree_ind' (tr : nat_tree) : P tr :=
match tr with
| NNode' n ls => NNode'_case n ls (list_nat_tree_ind ls)
end
with list_nat_tree_ind (ls : list nat_tree) : All P ls :=
match ls with
| Nil => I
| Cons tr rest => conj (nat_tree_ind' tr) (list_nat_tree_ind rest)
end.
Coq rejects this definition, saying
Recursive call to nat_tree_ind' has principal argument equal to "tr" instead of rest.
There is no deep theoretical reason why this program should be rejected; Coq applies incomplete termination-checking heuristics, and it is necessary to learn a few of the most important rules. The term "nested inductive type" hints at the solution to this particular problem. Just as mutually inductive types require mutually recursive induction principles, nested types require nested recursion.
Fixpoint nat_tree_ind' (tr : nat_tree) : P tr :=
match tr with
| NNode' n ls => NNode'_case n ls
((fix list_nat_tree_ind (ls : list nat_tree) : All P ls :=
match ls with
| Nil => I
| Cons tr' rest => conj (nat_tree_ind' tr') (list_nat_tree_ind rest)
end) ls)
end.
We include an anonymous fix version of list_nat_tree_ind that is literally nested inside the definition of the recursive function corresponding to the inductive definition that had the nested use of list.
We can try our induction principle out by defining some recursive functions on nat_tree and proving a theorem about them. First, we define some helper functions that operate on lists.
Section map.
Variables T T' : Set.
Variable F : T -> T'.
Fixpoint map (ls : list T) : list T' :=
match ls with
| Nil => Nil
| Cons h t => Cons (F h) (map t)
end.
End map.
Fixpoint sum (ls : list nat) : nat :=
match ls with
| Nil => O
| Cons h t => plus h (sum t)
end.
Now we can define a size function over our trees.
Fixpoint ntsize (tr : nat_tree) : nat :=
match tr with
| NNode' _ trs => S (sum (map ntsize trs))
end.
Notice that Coq was smart enough to expand the definition of map to verify that we are using proper nested recursion, even through a use of a higher-order function.
Fixpoint ntsplice (tr1 tr2 : nat_tree) : nat_tree :=
match tr1 with
| NNode' n Nil => NNode' n (Cons tr2 Nil)
| NNode' n (Cons tr trs) => NNode' n (Cons (ntsplice tr tr2) trs)
end.
We have defined another arbitrary notion of tree splicing, similar to before, and we can prove an analogous theorem about its relationship with tree size. We start with a useful lemma about addition.
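A sketch of one such lemma. We assume that plus_S states how plus treats a successor in its second argument; only the name comes from the text, and the statement is our guess. The definition of plus is repeated so that the snippet stands alone, and the proof is spelled out by hand rather than with crush.

```coq
Fixpoint plus (n m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)
  end.

(* Assumed statement: a successor in the second argument can be pulled out. *)
Lemma plus_S : forall n1 n2 : nat, plus n1 (S n2) = S (plus n1 n2).
Proof.
  induction n1 as [| n1' IH]; intro n2; simpl.
  - reflexivity.
  - rewrite IH. reflexivity.
Qed.
```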
Now we begin the proof of the theorem, adding the lemma plus_S as a hint.
Theorem ntsize_ntsplice : forall tr1 tr2 : nat_tree, ntsize (ntsplice tr1 tr2)
= plus (ntsize tr2) (ntsize tr1).
Hint Rewrite plus_S.
We know that the standard induction principle is insufficient for the task, so we need to provide a using clause for the induction tactic to specify our alternate principle.
induction tr1 using nat_tree_ind'; crush.
One subgoal remains:
n : nat
ls : list nat_tree
H : All
(fun tr1 : nat_tree =>
forall tr2 : nat_tree,
ntsize (ntsplice tr1 tr2) = plus (ntsize tr2) (ntsize tr1)) ls
tr2 : nat_tree
============================
ntsize
match ls with
| Nil => NNode' n (Cons tr2 Nil)
| Cons tr trs => NNode' n (Cons (ntsplice tr tr2) trs)
end = S (plus (ntsize tr2) (sum (map ntsize ls)))
After a few moments of squinting at this goal, it becomes apparent that we need to do a case analysis on the structure of ls. The rest is routine.
destruct ls; crush.
We can go further in automating the proof by exploiting the hint mechanism.
Restart.
Hint Extern 1 (ntsize (match ?LS with Nil => _ | Cons _ _ => _ end) = _) =>
destruct LS; crush.
induction tr1 using nat_tree_ind'; crush.
Qed.
We will go into great detail on hints in a later chapter, but the only important thing to note here is that we register a pattern that describes a conclusion we expect to encounter during the proof. The pattern may contain unification variables, whose names are prefixed with question marks, and we may refer to those bound variables in a tactic that we ask to have run whenever the pattern matches.
The advantage of using the hint is not very clear here, because the original proof was so short. However, the hint has fundamentally improved the readability of our proof. Before, the proof referred to the local variable ls, which has an automatically generated name. To a human reading the proof script without stepping through it interactively, it was not clear where ls came from. The hint explains to the reader the process for choosing which variables to case analyze, and the hint can continue working even if the rest of the proof structure changes significantly.
Manual Proofs About Constructors
It can be useful to understand how tactics like discriminate and injection work, so it is worth stepping through a manual proof of each kind. We will start with a proof fit for discriminate.
We begin with the tactic red, which is short for "one step of reduction," to unfold the definition of logical negation.
red.
============================
true = false -> False
intro H.
H : true = false
============================
False
It is worth recalling the difference between the lowercase and uppercase versions of truth and falsehood: True and False are logical propositions, while true and false are Boolean values that we can case-analyze. We have defined toProp such that our conclusion of False is computationally equivalent to toProp false. Thus, the change tactic will let us change the conclusion to toProp false. The general form change e replaces the conclusion with e, whenever Coq's built-in computation rules suffice to establish the equivalence of e with the original conclusion.
H : true = false
============================
toProp false
rewrite <- H.
H : true = false
============================
toProp true
simpl.
trivial.
Qed.
I have no trivial automated version of this proof to suggest, beyond using discriminate or congruence in the first place.
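For reference, the steps above can be assembled into one script. This is a sketch: the theorem name true_neq_false is our own, and toProp is assumed to map true to True and false to False, as the discussion of change implies.

```coq
(* Maps the Boolean values to the corresponding propositions. *)
Definition toProp (b : bool) : Prop := if b then True else False.

Theorem true_neq_false : true <> false.
Proof.
  red.                     (* unfold negation: true = false -> False *)
  intro H.
  change (toProp false).   (* False is convertible with toProp false *)
  rewrite <- H.            (* replace false by true in the conclusion *)
  simpl.                   (* toProp true computes to True *)
  trivial.
Qed.
```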
We can perform a similar manual proof of injectivity of the constructor S. I leave a walk-through of the details to curious readers who want to run the proof script interactively.
Theorem S_inj' : forall n m : nat, S n = S m -> n = m.
intros n m H.
change (pred (S n) = pred (S m)).
rewrite H.
reflexivity.
Qed.
The key piece of creativity in this theorem comes in the use of the natural number predecessor function pred. Embodied in the implementation of injection is a generic recipe for writing such type-specific functions.
The examples in this section illustrate an important aspect of the design philosophy behind Coq. We could certainly design a Gallina replacement that built in rules for constructor discrimination and injectivity, but a simpler alternative is to include a few carefully chosen rules that enable the desired reasoning patterns and many others. A key benefit of this philosophy is that the complexity of proof checking is minimized, which bolsters our confidence that proved theorems are really true.