comparison src/InductiveTypes.v @ 471:51a8472aca68

Batch of changes based on proofreader feedback
author Adam Chlipala <adam@chlipala.net>
date Mon, 08 Oct 2012 16:04:49 -0400
parents 0df6dde807ab
children 524458532b76
comparison
equal deleted inserted replaced
470:0df6dde807ab 471:51a8472aca68
18 18
19 (** %\part{Basic Programming and Proving} 19 (** %\part{Basic Programming and Proving}
20 20
21 \chapter{Introducing Inductive Types}% *) 21 \chapter{Introducing Inductive Types}% *)
22 22
23 (** In a sense, CIC is built from just two relatively straightforward features: function types and inductive types. From this modest foundation, we can prove effectively all of the theorems of math and carry out effectively all program verifications, with enough effort expended. This chapter introduces induction and recursion for functional programming in Coq. Most of our examples reproduce functionality from the Coq standard library, and we have tried to copy the standard library's choices of identifiers, where possible, so many of the definitions here are already available in the default Coq environment. 23 (** In a sense, CIC is built from just two relatively straightforward features: function types and inductive types. From this modest foundation, we can prove effectively all of the theorems of math and carry out effectively all program verifications, with enough effort expended. This chapter introduces induction and recursion for functional programming in Coq. Most of our examples reproduce functionality from the Coq standard library, and I have tried to copy the standard library's choices of identifiers, where possible, so many of the definitions here are already available in the default Coq environment.
24 24
25 The last chapter took a deep dive into some of the more advanced Coq features, to highlight the unusual approach that I advocate in this book. However, from this point on, we will rewind and go back to basics, presenting the relevant features of Coq in a more bottom-up manner. A useful first step is a discussion of the differences and relationships between proofs and programs in Coq. *) 25 The last chapter took a deep dive into some of the more advanced Coq features, to highlight the unusual approach that I advocate in this book. However, from this point on, we will rewind and go back to basics, presenting the relevant features of Coq in a more bottom-up manner. A useful first step is a discussion of the differences and relationships between proofs and programs in Coq. *)
26 26
27 27
28 (** * Proof Terms *) 28 (** * Proof Terms *)
62 62
63 (** %\smallskip{}%Every one of these example programs whose type looks like a logical formula is a%\index{proof term}% _proof term_. We use that name for any Gallina term of a logical type, and we will elaborate shortly on what makes a type logical. 63 (** %\smallskip{}%Every one of these example programs whose type looks like a logical formula is a%\index{proof term}% _proof term_. We use that name for any Gallina term of a logical type, and we will elaborate shortly on what makes a type logical.
64 64
65 In the rest of this chapter, we will introduce different ways of defining types. Every example type can be interpreted alternatively as a type of programs or proofs. 65 In the rest of this chapter, we will introduce different ways of defining types. Every example type can be interpreted alternatively as a type of programs or proofs.
66 66
67 One of the first types we introduce will be [bool], with constructors [true] and [false]. Newcomers to Coq often wonder about the distinction between [True] and [true] and the distinction between [False] and [false]. One glib answer is that [True] and [False] are types, but [true] and [false] are not. A more useful answer is that Coq's metatheory guarantees that any term of type [bool] _evaluates_ to either [true] or [false]. This means that we have an _algorithm_ for answering any question phrased as an expression of type [bool]. Conversely, most propositions do not evaluate to [True] or [False]; the language of inductively defined propositions is much richer than that. We ought to be glad that we have no algorithm for deciding mathematical truth, since otherwise it would be clear that we could not formalize undecidable properties, like almost any interesting property of general-purpose programs. *) 67 One of the first types we introduce will be [bool], with constructors [true] and [false]. Newcomers to Coq often wonder about the distinction between [True] and [true] and the distinction between [False] and [false]. One glib answer is that [True] and [False] are types, but [true] and [false] are not. A more useful answer is that Coq's metatheory guarantees that any term of type [bool] _evaluates_ to either [true] or [false]. This means that we have an _algorithm_ for answering any question phrased as an expression of type [bool]. Conversely, most propositions do not evaluate to [True] or [False]; the language of inductively defined propositions is much richer than that. We ought to be glad that we have no algorithm for deciding our formalized version of mathematical truth, since otherwise it would be clear that we could not formalize undecidable properties, like almost any interesting property of general-purpose programs. *)
68 68
69 69
70 (** * Enumerations *) 70 (** * Enumerations *)
71 71
72 (** Coq inductive types generalize the %\index{algebraic datatypes}%algebraic datatypes found in %\index{Haskell}%Haskell and %\index{ML}%ML. Confusingly enough, inductive types also generalize %\index{generalized algebraic datatypes}%generalized algebraic datatypes (GADTs), by adding the possibility for type dependency. Even so, it is worth backing up from the examples of the last chapter and going over basic, algebraic datatype uses of inductive datatypes, because the chance to prove things about the values of these types adds new wrinkles beyond usual practice in Haskell and ML. 72 (** Coq inductive types generalize the %\index{algebraic datatypes}%algebraic datatypes found in %\index{Haskell}%Haskell and %\index{ML}%ML. Confusingly enough, inductive types also generalize %\index{generalized algebraic datatypes}%generalized algebraic datatypes (GADTs), by adding the possibility for type dependency. Even so, it is worth backing up from the examples of the last chapter and going over basic, algebraic datatype uses of inductive datatypes, because the chance to prove things about the values of these types adds new wrinkles beyond usual practice in Haskell and ML.
74 The singleton type [unit] is an inductive type:%\index{Gallina terms!unit}\index{Gallina terms!tt}% *) 74 The singleton type [unit] is an inductive type:%\index{Gallina terms!unit}\index{Gallina terms!tt}% *)
75 75
76 Inductive unit : Set := 76 Inductive unit : Set :=
77 | tt. 77 | tt.
78 78
79 (** This vernacular command defines a new inductive type [unit] whose only value is [tt], as we can see by checking the types of the two identifiers: *) 79 (** This vernacular command defines a new inductive type [unit] whose only value is [tt]. We can verify the types of the two identifiers we introduce: *)
80 80
81 Check unit. 81 Check unit.
82 (** [unit : Set] *) 82 (** [unit : Set] *)
83 83
84 Check tt. 84 Check tt.
320 (* begin thide *) 320 (* begin thide *)
321 injection 1; trivial. 321 injection 1; trivial.
322 Qed. 322 Qed.
323 (* end thide *) 323 (* end thide *)
324 324
325 (** The [injection] tactic refers to a premise by number, adding new equalities between the corresponding arguments of equated terms that are formed with the same constructor. We end up needing to prove [n = m -> n = m], so it is unsurprising that a tactic named [trivial] is able to finish the proof. 325 (** The [injection] tactic refers to a premise by number, adding new equalities between the corresponding arguments of equated terms that are formed with the same constructor. We end up needing to prove [n = m -> n = m], so it is unsurprising that a tactic named [trivial] is able to finish the proof. This tactic attempts a variety of single proof steps, drawn from a user-specified database that we will later see how to extend.
326 326
327 There is also a very useful tactic called %\index{tactics!congruence}%[congruence] that can prove this theorem immediately. The [congruence] tactic generalizes [discriminate] and [injection], and it also adds reasoning about the general properties of equality, such as that a function returns equal results on equal arguments. That is, [congruence] is a%\index{theory of equality and uninterpreted functions}% _complete decision procedure for the theory of equality and uninterpreted functions_, plus some smarts about inductive types. 327 There is also a very useful tactic called %\index{tactics!congruence}%[congruence] that can prove this theorem immediately. The [congruence] tactic generalizes [discriminate] and [injection], and it also adds reasoning about the general properties of equality, such as that a function returns equal results on equal arguments. That is, [congruence] is a%\index{theory of equality and uninterpreted functions}% _complete decision procedure for the theory of equality and uninterpreted functions_, plus some smarts about inductive types.
328 328
329 %\medskip% 329 %\medskip%
330 330
366 forall n : nat_list, P n 366 forall n : nat_list, P n
367 ]] 367 ]]
368 368
369 %\medskip% 369 %\medskip%
370 370
371 In general, we can implement any "tree" types as inductive types. For example, here are binary trees of naturals. *) 371 In general, we can implement any "tree" type as an inductive type. For example, here are binary trees of naturals. *)
372 372
373 Inductive nat_btree : Set := 373 Inductive nat_btree : Set :=
374 | NLeaf : nat_btree 374 | NLeaf : nat_btree
375 | NNode : nat_btree -> nat -> nat_btree -> nat_btree. 375 | NNode : nat_btree -> nat -> nat_btree -> nat_btree.
376
377 (** Here are two functions whose intuitive explanations are not so important. The first one computes the size of a tree, and the second performs some sort of splicing of one tree into the leftmost available leaf node of another. *)
376 378
377 Fixpoint nsize (tr : nat_btree) : nat := 379 Fixpoint nsize (tr : nat_btree) : nat :=
378 match tr with 380 match tr with
379 | NLeaf => S O 381 | NLeaf => S O
380 | NNode tr1 _ tr2 => plus (nsize tr1) (nsize tr2) 382 | NNode tr1 _ tr2 => plus (nsize tr1) (nsize tr2)
399 401
400 induction tr1; crush. 402 induction tr1; crush.
401 Qed. 403 Qed.
402 (* end thide *) 404 (* end thide *)
403 405
404 (** It is convenient that these proofs go through so easily, but it is useful to check that the tree induction principle works as usual. *) 406 (** It is convenient that these proofs go through so easily, but it is still useful to look into the details of what happened, by checking the statement of the tree induction principle. *)
405 407
406 Check nat_btree_ind. 408 Check nat_btree_ind.
407 (** %\vspace{-.15in}% [[ 409 (** %\vspace{-.15in}% [[
408 nat_btree_ind 410 nat_btree_ind
409 : forall P : nat_btree -> Prop, 411 : forall P : nat_btree -> Prop,
503 P (Nil T) -> 505 P (Nil T) ->
504 (forall (t : T) (l : list T), P l -> P (Cons t l)) -> 506 (forall (t : T) (l : list T), P l -> P (Cons t l)) ->
505 forall l : list T, P l 507 forall l : list T, P l
506 ]] 508 ]]
507 509
508 Thus, even though we just saw that [T] is added as an extra argument to the constructor [Cons], there is no quantifier for [T] in the type of the inductive case like there is for each of the other arguments. *) 510 Thus, even though we just saw that [T] is added as an extra argument to the constructor [Cons], the inductive case in the type of [list_ind] (i.e., the third line of the type) includes no quantifier for [T], even though all of the other arguments are quantified explicitly. Parameters in other inductive definitions are treated similarly in stating induction principles. *)
509 511
510 512
511 (** * Mutually Inductive Types *) 513 (** * Mutually Inductive Types *)
512 514
513 (** We can define inductive types that refer to each other: *) 515 (** We can define inductive types that refer to each other: *)
584 (forall (n : nat) (o : odd_list), P0 o -> P (ECons n o)) -> 586 (forall (n : nat) (o : odd_list), P0 o -> P (ECons n o)) ->
585 (forall (n : nat) (e : even_list), P e -> P0 (OCons n e)) -> 587 (forall (n : nat) (e : even_list), P e -> P0 (OCons n e)) ->
586 forall e : even_list, P e 588 forall e : even_list, P e
587 ]] 589 ]]
588 590
589 This is the principle we wanted in the first place. There is one more wrinkle left in using it: the [induction] tactic will not apply it for us automatically. It will be helpful to look at how to prove one of our past examples without using [induction], so that we can then generalize the technique to mutual inductive types.%\index{tactics!apply}% *) 591 This is the principle we wanted in the first place.
592
593 The [Scheme] command is for asking Coq to generate particular induction schemes that are mutual among a set of inductive types (possibly only one such type, in which case we get a normal induction principle). In a sense, it generalizes the induction scheme generation that goes on automatically for each inductive definition. Future Coq versions might make that automatic generation smarter, so that [Scheme] is needed in fewer places. In a few sections, we will see how induction principles are derived theorems in Coq, so that there is not actually any need to build in _any_ automatic scheme generation.
594
595 There is one more wrinkle left in using the [even_list_mut] induction principle: the [induction] tactic will not apply it for us automatically. It will be helpful to look at how to prove one of our past examples without using [induction], so that we can then generalize the technique to mutual inductive types.%\index{tactics!apply}% *)
590 596
591 Theorem n_plus_O' : forall n : nat, plus n O = n. 597 Theorem n_plus_O' : forall n : nat, plus n O = n.
598 apply nat_ind.
599 (** Here we use [apply], which is one of the most essential basic tactics. When we are trying to prove fact [P], and when [thm] is a theorem whose conclusion can be made to match [P] by proper choice of quantified variable values, the invocation [apply thm] will replace the current goal with one new goal for each premise of [thm].
600
601 This use of [apply] may seem a bit _too_ magical. To better see what is going on, we use a variant where we partially apply the theorem [nat_ind] to give an explicit value for the predicate that gives our induction hypothesis. *)
602
603 Undo.
592 apply (nat_ind (fun n => plus n O = n)); crush. 604 apply (nat_ind (fun n => plus n O = n)); crush.
593 Qed. 605 Qed.
594 606
595 (** From this example, we can see that [induction] is not magic. It only does some bookkeeping for us to make it easy to apply a theorem, which we can do directly with the [apply] tactic. We apply not just an identifier but a partial application of it, specifying the predicate we mean to prove holds for all naturals. 607 (** From this example, we can see that [induction] is not magic. It only does some bookkeeping for us to make it easy to apply a theorem, which we can do directly with the [apply] tactic.
596 608
597 This technique generalizes to our mutual example: *) 609 This technique generalizes to our mutual example: *)
598 610
599 Theorem elength_eapp : forall el1 el2 : even_list, 611 Theorem elength_eapp : forall el1 el2 : even_list,
600 elength (eapp el1 el2) = plus (elength el1) (elength el2). 612 elength (eapp el1 el2) = plus (elength el1) (elength el2).
903 (** * Nested Inductive Types *) 915 (** * Nested Inductive Types *)
904 916
905 (** Suppose we want to extend our earlier type of binary trees to trees with arbitrary finite branching. We can use lists to give a simple definition. *) 917 (** Suppose we want to extend our earlier type of binary trees to trees with arbitrary finite branching. We can use lists to give a simple definition. *)
906 918
907 Inductive nat_tree : Set := 919 Inductive nat_tree : Set :=
908 | NLeaf' : nat_tree
909 | NNode' : nat -> list nat_tree -> nat_tree. 920 | NNode' : nat -> list nat_tree -> nat_tree.
910 921
911 (** This is an example of a%\index{nested inductive type}% _nested_ inductive type definition, because we use the type we are defining as an argument to a parameterized type family. Coq will not allow all such definitions; it effectively pretends that we are defining [nat_tree] mutually with a version of [list] specialized to [nat_tree], checking that the resulting expanded definition satisfies the usual rules. For instance, if we replaced [list] with a type family that used its parameter as a function argument, then the definition would be rejected as violating the positivity restriction. 922 (** This is an example of a%\index{nested inductive type}% _nested_ inductive type definition, because we use the type we are defining as an argument to a parameterized type family. Coq will not allow all such definitions; it effectively pretends that we are defining [nat_tree] mutually with a version of [list] specialized to [nat_tree], checking that the resulting expanded definition satisfies the usual rules. For instance, if we replaced [list] with a type family that used its parameter as a function argument, then the definition would be rejected as violating the positivity restriction.
912 923
913 Like we encountered for mutual inductive types, we find that the automatically generated induction principle for [nat_tree] is too weak. *) 924 Like we encountered for mutual inductive types, we find that the automatically generated induction principle for [nat_tree] is too weak. *)
925
926 (* begin hide *)
927 (* begin thide *)
928 Check Forall.
929 (* end thide *)
930 (* end hide *)
914 931
915 Check nat_tree_ind. 932 Check nat_tree_ind.
916 (** %\vspace{-.15in}% [[ 933 (** %\vspace{-.15in}% [[
917 nat_tree_ind 934 nat_tree_ind
918 : forall P : nat_tree -> Prop, 935 : forall P : nat_tree -> Prop,
919 P NLeaf' ->
920 (forall (n : nat) (l : list nat_tree), P (NNode' n l)) -> 936 (forall (n : nat) (l : list nat_tree), P (NNode' n l)) ->
921 forall n : nat_tree, P n 937 forall n : nat_tree, P n
922 ]] 938 ]]
923 939
924 There is no command like [Scheme] that will implement an improved principle for us. In general, it takes creativity to figure out how to incorporate nested uses of different type families. This is roughly the same creativity employed in the traditional task of strengthening an induction hypothesis. Now that we know how to implement induction principles manually, we are in a position to apply just such creativity to this problem. 940 There is no command like [Scheme] that will implement an improved principle for us. In general, it takes creativity to figure out _good_ ways to incorporate nested uses of different type families. Now that we know how to implement induction principles manually, we are in a position to apply just such creativity to this problem.
925 941
926 First, we will need an auxiliary definition, characterizing what it means for a property to hold of every element of a list. *) 942 Many induction principles for types with nested used of [list] could benefit from a unified predicate capturing the idea that some property holds of every element in a list. By defining this generic predicate once, we facilitate reuse of library theorems about it. (Here, we are actually duplicating the standard library's [Forall] predicate, with a different implementation, for didactic purposes.) *)
927 943
928 Section All. 944 Section All.
929 Variable T : Set. 945 Variable T : Set.
930 Variable P : T -> Prop. 946 Variable P : T -> Prop.
931 947
969 Now we create a section for our induction principle, following the same basic plan as in the previous section of this chapter. *) 985 Now we create a section for our induction principle, following the same basic plan as in the previous section of this chapter. *)
970 986
971 Section nat_tree_ind'. 987 Section nat_tree_ind'.
972 Variable P : nat_tree -> Prop. 988 Variable P : nat_tree -> Prop.
973 989
974 Hypothesis NLeaf'_case : P NLeaf'.
975 Hypothesis NNode'_case : forall (n : nat) (ls : list nat_tree), 990 Hypothesis NNode'_case : forall (n : nat) (ls : list nat_tree),
976 All P ls -> P (NNode' n ls). 991 All P ls -> P (NNode' n ls).
977 992
978 (* begin hide *) 993 (* begin hide *)
979 (* begin thide *) 994 (* begin thide *)
984 (** A first attempt at writing the induction principle itself follows the intuition that nested inductive type definitions are expanded into mutual inductive definitions. 999 (** A first attempt at writing the induction principle itself follows the intuition that nested inductive type definitions are expanded into mutual inductive definitions.
985 1000
986 %\vspace{-.15in}%[[ 1001 %\vspace{-.15in}%[[
987 Fixpoint nat_tree_ind' (tr : nat_tree) : P tr := 1002 Fixpoint nat_tree_ind' (tr : nat_tree) : P tr :=
988 match tr with 1003 match tr with
989 | NLeaf' => NLeaf'_case
990 | NNode' n ls => NNode'_case n ls (list_nat_tree_ind ls) 1004 | NNode' n ls => NNode'_case n ls (list_nat_tree_ind ls)
991 end 1005 end
992 1006
993 with list_nat_tree_ind (ls : list nat_tree) : All P ls := 1007 with list_nat_tree_ind (ls : list nat_tree) : All P ls :=
994 match ls with 1008 match ls with
1005 1019
1006 There is no deep theoretical reason why this program should be rejected; Coq applies incomplete termination-checking heuristics, and it is necessary to learn a few of the most important rules. The term "nested inductive type" hints at the solution to this particular problem. Just like true mutually inductive types require mutually recursive induction principles, nested types require nested recursion. *) 1020 There is no deep theoretical reason why this program should be rejected; Coq applies incomplete termination-checking heuristics, and it is necessary to learn a few of the most important rules. The term "nested inductive type" hints at the solution to this particular problem. Just like true mutually inductive types require mutually recursive induction principles, nested types require nested recursion. *)
1007 1021
1008 Fixpoint nat_tree_ind' (tr : nat_tree) : P tr := 1022 Fixpoint nat_tree_ind' (tr : nat_tree) : P tr :=
1009 match tr with 1023 match tr with
1010 | NLeaf' => NLeaf'_case
1011 | NNode' n ls => NNode'_case n ls 1024 | NNode' n ls => NNode'_case n ls
1012 ((fix list_nat_tree_ind (ls : list nat_tree) : All P ls := 1025 ((fix list_nat_tree_ind (ls : list nat_tree) : All P ls :=
1013 match ls with 1026 match ls with
1014 | Nil => I 1027 | Nil => I
1015 | Cons tr rest => conj (nat_tree_ind' tr) (list_nat_tree_ind rest) 1028 | Cons tr' rest => conj (nat_tree_ind' tr') (list_nat_tree_ind rest)
1016 end) ls) 1029 end) ls)
1017 end. 1030 end.
1018 1031
1019 (** We include an anonymous [fix] version of [list_nat_tree_ind] that is literally _nested_ inside the definition of the recursive function corresponding to the inductive definition that had the nested use of [list]. *) 1032 (** We include an anonymous [fix] version of [list_nat_tree_ind] that is literally _nested_ inside the definition of the recursive function corresponding to the inductive definition that had the nested use of [list]. *)
1020 1033
1041 1054
1042 (** Now we can define a size function over our trees. *) 1055 (** Now we can define a size function over our trees. *)
1043 1056
1044 Fixpoint ntsize (tr : nat_tree) : nat := 1057 Fixpoint ntsize (tr : nat_tree) : nat :=
1045 match tr with 1058 match tr with
1046 | NLeaf' => S O
1047 | NNode' _ trs => S (sum (map ntsize trs)) 1059 | NNode' _ trs => S (sum (map ntsize trs))
1048 end. 1060 end.
1049 1061
1050 (** Notice that Coq was smart enough to expand the definition of [map] to verify that we are using proper nested recursion, even through a use of a higher-order function. *) 1062 (** Notice that Coq was smart enough to expand the definition of [map] to verify that we are using proper nested recursion, even through a use of a higher-order function. *)
1051 1063
1052 Fixpoint ntsplice (tr1 tr2 : nat_tree) : nat_tree := 1064 Fixpoint ntsplice (tr1 tr2 : nat_tree) : nat_tree :=
1053 match tr1 with 1065 match tr1 with
1054 | NLeaf' => NNode' O (Cons tr2 Nil)
1055 | NNode' n Nil => NNode' n (Cons tr2 Nil) 1066 | NNode' n Nil => NNode' n (Cons tr2 Nil)
1056 | NNode' n (Cons tr trs) => NNode' n (Cons (ntsplice tr tr2) trs) 1067 | NNode' n (Cons tr trs) => NNode' n (Cons (ntsplice tr tr2) trs)
1057 end. 1068 end.
1058 1069
1059 (** We have defined another arbitrary notion of tree splicing, similar to before, and we can prove an analogous theorem about its relationship with tree size. We start with a useful lemma about addition. *) 1070 (** We have defined another arbitrary notion of tree splicing, similar to before, and we can prove an analogous theorem about its relationship with tree size. We start with a useful lemma about addition. *)