comparison src/DataStruct.v @ 284:693897f8e0cb

PC comments for DataStruct
author Adam Chlipala <adam@chlipala.net>
date Fri, 05 Nov 2010 13:48:39 -0400
parents d8c54a25c81f
children 1b6c81e51799
comparison
equal deleted inserted replaced
283:756ce68e42fb 284:693897f8e0cb
1 (* Copyright (c) 2008-2009, Adam Chlipala 1 (* Copyright (c) 2008-2010, Adam Chlipala
2 * 2 *
3 * This work is licensed under a 3 * This work is licensed under a
4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
5 * Unported License. 5 * Unported License.
6 * The license text is available at: 6 * The license text is available at:
30 30
31 Inductive ilist : nat -> Set := 31 Inductive ilist : nat -> Set :=
32 | Nil : ilist O 32 | Nil : ilist O
33 | Cons : forall n, A -> ilist n -> ilist (S n). 33 | Cons : forall n, A -> ilist n -> ilist (S n).
34 34
35 (** We might like to have a certified function for selecting an element of an [ilist] by position. We could do this using subset types and explicit manipulation of proofs, but dependent types let us do it more directly. It is helpful to define a type family [fin], where [fin n] is isomorphic to [{m : nat | m < n}]. The type family names stands for "finite." *) 35 (** We might like to have a certified function for selecting an element of an [ilist] by position. We could do this using subset types and explicit manipulation of proofs, but dependent types let us do it more directly. It is helpful to define a type family [fin], where [fin n] is isomorphic to [{m : nat | m < n}]. The type family name stands for %``%#"#finite.#"#%''% *)
36 36
37 (* EX: Define a function [get] for extracting an [ilist] element by position. *) 37 (* EX: Define a function [get] for extracting an [ilist] element by position. *)
38 38
39 (* begin thide *) 39 (* begin thide *)
40 Inductive fin : nat -> Set := 40 Inductive fin : nat -> Set :=
41 | First : forall n, fin (S n) 41 | First : forall n, fin (S n)
42 | Next : forall n, fin n -> fin (S n). 42 | Next : forall n, fin n -> fin (S n).
43 43
44 (** [fin] essentially makes a more richly-typed copy of the natural numbers. Every element is a [First] iterated through applying [Next] a number of times that indicates which number is being selected. 44 (** [fin] essentially makes a more richly-typed copy of the natural numbers. Every element is a [First] iterated through applying [Next] a number of times that indicates which number is being selected. For instance, the three values of type [fin 3] are [First 2], [Next (First 1)], and [Next (Next (First 0))].
45 45
46 Now it is easy to pick a [Prop]-free type for a selection function. As usual, our first implementation attempt will not convince the type checker, and we will attack the deficiencies one at a time. 46 Now it is easy to pick a [Prop]-free type for a selection function. As usual, our first implementation attempt will not convince the type checker, and we will attack the deficiencies one at a time.
47 47
48 [[ 48 [[
49 Fixpoint get n (ls : ilist n) : fin n -> A := 49 Fixpoint get n (ls : ilist n) : fin n -> A :=
78 end 78 end
79 end. 79 end.
80 80
81 ]] 81 ]]
82 82
83 Now the first [match] case type-checks, and we see that the problem with the [Cons] case is that the pattern-bound variable [idx'] does not have an apparent type compatible with [ls']. We need to use [match] annotations to make the relationship explicit. Unfortunately, the usual trick of postponing argument binding will not help us here. We need to match on both [ls] and [idx]; one or the other must be matched first. To get around this, we apply the convoy pattern that we met last chapter. This application is a little more clever than those we saw before; we use the natural number predecessor function [pred] to express the relationship between the types of these variables. 83 Now the first [match] case type-checks, and we see that the problem with the [Cons] case is that the pattern-bound variable [idx'] does not have an apparent type compatible with [ls']. In fact, the error message Coq gives for this exact code can be confusing, thanks to an overenthusiastic type inference heuristic. We are told that the [Nil] case body has type [match X with | 0 => A | S _ => unit end] for a unification variable [X], while it is expected to have type [A]. We can see that setting [X] to [0] resolves the conflict, but Coq is not yet smart enough to do this unification automatically. Repeating the function's type in a [return] annotation, used with an [in] annotation, leads us to a more informative error message, saying that [idx'] has type [fin n1] while it is expected to have type [fin n0], where [n0] is bound by the [Cons] pattern and [n1] by the [Next] pattern. As the code is written above, nothing forces these two natural numbers to be equal, though we know intuitively that they must be.
84
85 We need to use [match] annotations to make the relationship explicit. Unfortunately, the usual trick of postponing argument binding will not help us here. We need to match on both [ls] and [idx]; one or the other must be matched first. To get around this, we apply the convoy pattern that we met last chapter. This application is a little more clever than those we saw before; we use the natural number predecessor function [pred] to express the relationship between the types of these variables.
84 86
85 [[ 87 [[
86 Fixpoint get n (ls : ilist n) : fin n -> A := 88 Fixpoint get n (ls : ilist n) : fin n -> A :=
87 match ls with 89 match ls with
88 | Nil => fun idx => 90 | Nil => fun idx =>
166 match ls with 168 match ls with
167 | Nil => Nil 169 | Nil => Nil
168 | Cons _ x ls' => Cons (f x) (imap ls') 170 | Cons _ x ls' => Cons (f x) (imap ls')
169 end. 171 end.
170 172
171 (** It is easy to prove that [get] "distributes over" [imap] calls. The only tricky bit is remembering to use the [dep_destruct] tactic in place of plain [destruct] when faced with a baffling tactic error message. *) 173 (** It is easy to prove that [get] %``%#"#distributes over#"#%''% [imap] calls. The only tricky bit is remembering to use our [dep_destruct] tactic in place of plain [destruct] when faced with a baffling tactic error message. *)
172 174
173 Theorem get_imap : forall n (idx : fin n) (ls : ilist A n), 175 Theorem get_imap : forall n (idx : fin n) (ls : ilist A n),
174 get (imap ls) idx = f (get ls idx). 176 get (imap ls) idx = f (get ls idx).
175 (* begin thide *) 177 (* begin thide *)
176 induction ls; dep_destruct idx; crush. 178 induction ls; dep_destruct idx; crush.
179 End ilist_map. 181 End ilist_map.
180 182
181 183
182 (** * Heterogeneous Lists *) 184 (** * Heterogeneous Lists *)
183 185
184 (** Programmers who move to statically-typed functional languages from "scripting languages" often complain about the requirement that every element of a list have the same type. With fancy type systems, we can partially lift this requirement. We can index a list type with a "type-level" list that explains what type each element of the list should have. This has been done in a variety of ways in Haskell using type classes, and we can do it much more cleanly and directly in Coq. *) 186 (** Programmers who move to statically-typed functional languages from %``%#"#scripting languages#"#%''% often complain about the requirement that every element of a list have the same type. With fancy type systems, we can partially lift this requirement. We can index a list type with a %``%#"#type-level#"#%''% list that explains what type each element of the list should have. This has been done in a variety of ways in Haskell using type classes, and we can do it much more cleanly and directly in Coq. *)
185 187
186 Section hlist. 188 Section hlist.
187 Variable A : Type. 189 Variable A : Type.
188 Variable B : A -> Type. 190 Variable B : A -> Type.
189 191
206 208
207 Inductive member : list A -> Type := 209 Inductive member : list A -> Type :=
208 | MFirst : forall ls, member (elm :: ls) 210 | MFirst : forall ls, member (elm :: ls)
209 | MNext : forall x ls, member ls -> member (x :: ls). 211 | MNext : forall x ls, member ls -> member (x :: ls).
210 212
211 (** Because the element [elm] that we are "searching for" in a list does not change across the constructors of [member], we simplify our definitions by making [elm] a local variable. In the definition of [member], we say that [elm] is found in any list that begins with [elm], and, if removing the first element of a list leaves [elm] present, then [elm] is present in the original list, too. The form looks much like a predicate for list membership, but we purposely define [member] in [Type] so that we may decompose its values to guide computations. 213 (** Because the element [elm] that we are %``%#"#searching for#"#%''% in a list does not change across the constructors of [member], we simplify our definitions by making [elm] a local variable. In the definition of [member], we say that [elm] is found in any list that begins with [elm], and, if removing the first element of a list leaves [elm] present, then [elm] is present in the original list, too. The form looks much like a predicate for list membership, but we purposely define [member] in [Type] so that we may decompose its values to guide computations.
212 214
213 We can use [member] to adapt our definition of [get] to [hlists]. The same basic [match] tricks apply. In the [MCons] case, we form a two-element convoy, passing both the data element [x] and the recursor for the sublist [mls'] to the result of the inner [match]. We did not need to do that in [get]'s definition because the types of list elements were not dependent there. *) 215 We can use [member] to adapt our definition of [get] to [hlists]. The same basic [match] tricks apply. In the [MCons] case, we form a two-element convoy, passing both the data element [x] and the recursor for the sublist [mls'] to the result of the inner [match]. We did not need to do that in [get]'s definition because the types of list elements were not dependent there. *)
214 216
215 Fixpoint hget ls (mls : hlist ls) : member ls -> B elm := 217 Fixpoint hget ls (mls : hlist ls) : member ls -> B elm :=
216 match mls with 218 match mls with
280 282
281 Inductive type : Set := 283 Inductive type : Set :=
282 | Unit : type 284 | Unit : type
283 | Arrow : type -> type -> type. 285 | Arrow : type -> type -> type.
284 286
285 (** Now we can define a type family for expressions. An [exp ts t] will stand for an expression that has type [t] and whose free variables have types in the list [ts]. We effectively use the de Bruijn variable representation, which we will discuss in more detail in later chapters. Variables are represented as [member] values; that is, a variable is more or less a constructive proof that a particular type is found in the type environment. *) 287 (** Now we can define a type family for expressions. An [exp ts t] will stand for an expression that has type [t] and whose free variables have types in the list [ts]. We effectively use the de Bruijn variable representation, which we will discuss in more detail in later chapters, including a case study in Chapter 16. Variables are represented as [member] values; that is, a variable is more or less a constructive proof that a particular type is found in the type environment. *)
286 288
287 Inductive exp : list type -> type -> Set := 289 Inductive exp : list type -> type -> Set :=
288 | Const : forall ts, exp ts Unit 290 | Const : forall ts, exp ts Unit
289 (* begin thide *) 291 (* begin thide *)
290 | Var : forall ts t, member t ts -> exp ts t 292 | Var : forall ts t, member t ts -> exp ts t
354 (** We are starting to develop the tools behind dependent typing's amazing advantage over alternative approaches in several important areas. Here, we have implemented complete syntax, typing rules, and evaluation semantics for simply-typed lambda calculus without even needing to define a syntactic substitution operation. We did it all without a single line of proof, and our implementation is manifestly executable. In a later chapter, we will meet other, more common approaches to language formalization. Such approaches often state and prove explicit theorems about type safety of languages. In the above example, we got type safety, termination, and other meta-theorems for free, by reduction to CIC, which we know has those properties. *) 356 (** We are starting to develop the tools behind dependent typing's amazing advantage over alternative approaches in several important areas. Here, we have implemented complete syntax, typing rules, and evaluation semantics for simply-typed lambda calculus without even needing to define a syntactic substitution operation. We did it all without a single line of proof, and our implementation is manifestly executable. In a later chapter, we will meet other, more common approaches to language formalization. Such approaches often state and prove explicit theorems about type safety of languages. In the above example, we got type safety, termination, and other meta-theorems for free, by reduction to CIC, which we know has those properties. *)
355 357
356 358
357 (** * Recursive Type Definitions *) 359 (** * Recursive Type Definitions *)
358 360
359 (** There is another style of datatype definition that leads to much simpler definitions of the [get] and [hget] definitions above. Because Coq supports "type-level computation," we can redo our inductive definitions as %\textit{%#<i>#recursive#</i>#%}% definitions. *) 361 (** There is another style of datatype definition that leads to much simpler definitions of the [get] and [hget] definitions above. Because Coq supports %``%#"#type-level computation,#"#%''% we can redo our inductive definitions as %\textit{%#<i>#recursive#</i>#%}% definitions. *)
360 362
361 (* EX: Come up with an alternate [ilist] definition that makes it easier to write [get]. *) 363 (* EX: Come up with an alternate [ilist] definition that makes it easier to write [get]. *)
362 364
363 Section filist. 365 Section filist.
364 Variable A : Set. 366 Variable A : Set.
376 match n with 378 match n with
377 | O => Empty_set 379 | O => Empty_set
378 | S n' => option (ffin n') 380 | S n' => option (ffin n')
379 end. 381 end.
380 382
381 (** We express that there are no index values when [n = O], by defining such indices as type [Empty_set]; and we express that, at [n = S n'], there is a choice between picking the first element of the list (represented as [None]) or choosing a later element (represented by [Some idx], where [idx] is an index into the list tail). *) 383 (** We express that there are no index values when [n = O], by defining such indices as type [Empty_set]; and we express that, at [n = S n'], there is a choice between picking the first element of the list (represented as [None]) or choosing a later element (represented by [Some idx], where [idx] is an index into the list tail). For instance, the three values of type [ffin 3] are [None], [Some None], and [Some (Some None)]. *)
382 384
383 Fixpoint fget (n : nat) : filist n -> ffin n -> A := 385 Fixpoint fget (n : nat) : filist n -> ffin n -> A :=
384 match n with 386 match n with
385 | O => fun _ idx => match idx with end 387 | O => fun _ idx => match idx with end
386 | S n' => fun ls idx => 388 | S n' => fun ls idx =>
509 511
510 Theorem sum_inc : forall t, sum (inc t) >= sum t. 512 Theorem sum_inc : forall t, sum (inc t) >= sum t.
511 (* begin thide *) 513 (* begin thide *)
512 induction t; crush. 514 induction t; crush.
513 (** [[ 515 (** [[
514
515 n : nat 516 n : nat
516 i : ilist (tree nat) n 517 i : ilist (tree nat) n
517 ============================ 518 ============================
518 ifoldr (fun (t' : tree nat) (n0 : nat) => sum t' + n0) 0 (imap inc i) >= 519 ifoldr (fun (t' : tree nat) (n0 : nat) => sum t' + n0) 0 (imap inc i) >=
519 ifoldr (fun (t' : tree nat) (n0 : nat) => sum t' + n0) 0 i 520 ifoldr (fun (t' : tree nat) (n0 : nat) => sum t' + n0) 0 i
637 (* begin thide *) 638 (* begin thide *)
638 | Cond : forall n t, (ffin n -> exp' Bool) 639 | Cond : forall n t, (ffin n -> exp' Bool)
639 -> (ffin n -> exp' t) -> exp' t -> exp' t. 640 -> (ffin n -> exp' t) -> exp' t -> exp' t.
640 (* end thide *) 641 (* end thide *)
641 642
642 (** A [Cond] is parameterized by a natural [n], which tells us how many cases this conditional has. The test expressions are represented with a function of type [ffin n -> exp' Bool], and the bodies are represented with a function of type [ffin n -> exp' t], where [t] is the overall type. The final [exp' t] argument is the default case. 643 (** A [Cond] is parameterized by a natural [n], which tells us how many cases this conditional has. The test expressions are represented with a function of type [ffin n -> exp' Bool], and the bodies are represented with a function of type [ffin n -> exp' t], where [t] is the overall type. The final [exp' t] argument is the default case. For example, here is an expression that successively checks whether [2 + 2 = 5] (returning 0 if so) or if [1 + 1 = 2] (returning 1 if so), returning 2 otherwise. *)
643 644
644 We start implementing our interpreter with a standard type denotation function. *) 645 Example ex1 := Cond 2
646 (fun f => match f with
647 | None => Eq (Plus (NConst 2) (NConst 2)) (NConst 5)
648 | Some None => Eq (Plus (NConst 1) (NConst 1)) (NConst 2)
649 | Some (Some v) => match v with end
650 end)
651 (fun f => match f with
652 | None => NConst 0
653 | Some None => NConst 1
654 | Some (Some v) => match v with end
655 end)
656 (NConst 2).
657
658 (** We start implementing our interpreter with a standard type denotation function. *)
645 659
646 Definition type'Denote (t : type') : Set := 660 Definition type'Denote (t : type') : Set :=
647 match t with 661 match t with
648 | Nat => nat 662 | Nat => nat
649 | Bool => bool 663 | Bool => bool
828 842
829 (** It is not always clear which of these representation techniques to apply in a particular situation, but I will try to summarize the pros and cons of each. 843 (** It is not always clear which of these representation techniques to apply in a particular situation, but I will try to summarize the pros and cons of each.
830 844
831 Inductive types are often the most pleasant to work with, after someone has spent the time implementing some basic library functions for them, using fancy [match] annotations. Many aspects of Coq's logic and tactic support are specialized to deal with inductive types, and you may miss out if you use alternate encodings. 845 Inductive types are often the most pleasant to work with, after someone has spent the time implementing some basic library functions for them, using fancy [match] annotations. Many aspects of Coq's logic and tactic support are specialized to deal with inductive types, and you may miss out if you use alternate encodings.
832 846
833 Recursive types usually involve much less initial effort, but they can be less convenient to use with proof automation. For instance, the [simpl] tactic (which is among the ingredients in [crush]) will sometimes be overzealous in simplifying uses of functions over recursive types. Consider a call [get l f], where variable [l] has type [filist A (S n)]. The type of [l] would be simplified to an explicit pair type. In a proof involving many recursive types, this kind of unhelpful "simplification" can lead to rapid bloat in the sizes of subgoals. 847 Recursive types usually involve much less initial effort, but they can be less convenient to use with proof automation. For instance, the [simpl] tactic (which is among the ingredients in [crush]) will sometimes be overzealous in simplifying uses of functions over recursive types. Consider a call [get l f], where variable [l] has type [filist A (S n)]. The type of [l] would be simplified to an explicit pair type. In a proof involving many recursive types, this kind of unhelpful %``%#"#simplification#"#%''% can lead to rapid bloat in the sizes of subgoals.
834 848
835 Another disadvantage of recursive types is that they only apply to type families whose indices determine their "skeletons." This is not true for all data structures; a good counterexample comes from the richly-typed programming language syntax types we have used several times so far. The fact that a piece of syntax has type [Nat] tells us nothing about the tree structure of that syntax. 849 Another disadvantage of recursive types is that they only apply to type families whose indices determine their %``%#"#skeletons.#"#%''% This is not true for all data structures; a good counterexample comes from the richly-typed programming language syntax types we have used several times so far. The fact that a piece of syntax has type [Nat] tells us nothing about the tree structure of that syntax.
836 850
837 Reflexive encodings of data types are seen relatively rarely. As our examples demonstrated, manipulating index values manually can lead to hard-to-read code. A normal inductive type is generally easier to work with, once someone has gone through the trouble of implementing an induction principle manually with the techniques we studied in Chapter 3. For small developments, avoiding that kind of coding can justify the use of reflexive data structures. There are also some useful instances of co-inductive definitions with nested data structures (e.g., lists of values in the co-inductive type) that can only be deconstructed effectively with reflexive encoding of the nested structures. *) 851 Reflexive encodings of data types are seen relatively rarely. As our examples demonstrated, manipulating index values manually can lead to hard-to-read code. A normal inductive type is generally easier to work with, once someone has gone through the trouble of implementing an induction principle manually with the techniques we studied in Chapter 3. For small developments, avoiding that kind of coding can justify the use of reflexive data structures. There are also some useful instances of co-inductive definitions with nested data structures (e.g., lists of values in the co-inductive type) that can only be deconstructed effectively with reflexive encoding of the nested structures. *)
838 852
839 853
840 (** * Exercises *) 854 (** * Exercises *)
843 857
844 (** Some of the type family definitions and associated functions from this chapter are duplicated in the [DepList] module of the book source. Some of their names have been changed to be more sensible in a general context. 858 (** Some of the type family definitions and associated functions from this chapter are duplicated in the [DepList] module of the book source. Some of their names have been changed to be more sensible in a general context.
845 859
846 %\begin{enumerate}%#<ol># 860 %\begin{enumerate}%#<ol>#
847 861
848 %\item%#<li># Define a tree analogue of [hlist]. That is, define a parameterized type of binary trees with data at their leaves, and define a type family [htree] indexed by trees. The structure of an [htree] mirrors its index tree, with the type of each data element (which only occur at leaves) determined by applying a type function to the corresponding element of the index tree. Define a type standing for all possible paths from the root of a tree to leaves and use it to implement a function [tget] for extracting an element of an [htree] by path. Define a function [htmap2] for "mapping over two trees in parallel." That is, [htmap2] takes in two [htree]s with the same index tree, and it forms a new [htree] with the same index by applying a binary function pointwise. 862 %\item%#<li># Define a tree analogue of [hlist]. That is, define a parameterized type of binary trees with data at their leaves, and define a type family [htree] indexed by trees. The structure of an [htree] mirrors its index tree, with the type of each data element (which only occur at leaves) determined by applying a type function to the corresponding element of the index tree. Define a type standing for all possible paths from the root of a tree to leaves and use it to implement a function [tget] for extracting an element of an [htree] by path. Define a function [htmap2] for %``%#"#mapping over two trees in parallel.#"#%''% That is, [htmap2] takes in two [htree]s with the same index tree, and it forms a new [htree] with the same index by applying a binary function pointwise.
849 863
850 Repeat this process so that you implement each definition for each of the three definition styles covered in this chapter: inductive, recursive, and index function.#</li># 864 Repeat this process so that you implement each definition for each of the three definition styles covered in this chapter: inductive, recursive, and index function.#</li>#
851 865
852 %\item%#<li># Write a dependently-typed interpreter for a simple programming language with ML-style pattern-matching, using one of the encodings of heterogeneous lists to represent the different branches of a [case] expression. (There are other ways to represent the same thing, but the point of this exercise is to practice using those heterogeneous list types.) The object language is defined informally by this grammar: 866 %\item%#<li># Write a dependently-typed interpreter for a simple programming language with ML-style pattern-matching, using one of the encodings of heterogeneous lists to represent the different branches of a [case] expression. (There are other ways to represent the same thing, but the point of this exercise is to practice using those heterogeneous list types.) The object language is defined informally by this grammar:
853 867