Mercurial > cpdt > repo
diff src/StackMachine.v @ 398:05efde66559d
Get it working in Coq 8.4beta1; use nice coqdoc notation for italics
author | Adam Chlipala <adam@chlipala.net> |
---|---|
date | Wed, 06 Jun 2012 11:25:13 -0400 |
parents | cc8d0503619f |
children | 5986e9fd40b5 |
line wrap: on
line diff
--- a/src/StackMachine.v Sun May 06 17:15:15 2012 -0400 +++ b/src/StackMachine.v Wed Jun 06 11:25:13 2012 -0400 @@ -1,4 +1,4 @@ -(* Copyright (c) 2008-2011, Adam Chlipala +(* Copyright (c) 2008-2012, Adam Chlipala * * This work is licensed under a * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 @@ -85,15 +85,15 @@ ]] -Languages like Haskell and ML have a convenient %\index{principal types}\index{type inference}\emph{%#<i>#principal types#</i>#%}% property, which gives us strong guarantees about how effective type inference will be. Unfortunately, Coq's type system is so expressive that any kind of %``%#"#complete#"#%''% type inference is impossible, and the task even seems to be hard heuristically in practice. Nonetheless, Coq includes some very helpful heuristics, many of them copying the workings of Haskell and ML type-checkers for programs that fall in simple fragments of Coq's language. +Languages like Haskell and ML have a convenient %\index{principal types}\index{type inference}%_principal types_ property, which gives us strong guarantees about how effective type inference will be. Unfortunately, Coq's type system is so expressive that any kind of %``%#"#complete#"#%''% type inference is impossible, and the task even seems to be hard heuristically in practice. Nonetheless, Coq includes some very helpful heuristics, many of them copying the workings of Haskell and ML type-checkers for programs that fall in simple fragments of Coq's language. -This is as good a time as any to mention the preponderance of different languages associated with Coq. The theoretical foundation of Coq is a formal system called the %\index{Calculus of Inductive Constructions}\index{CIC|see{Calculus of Inductive Constructions}}\emph{%#<i>#Calculus of Inductive Constructions (CIC)#</i>#%}~\cite{CIC}%, which is an extension of the older %\index{Calculus of Constructions}\index{CoC|see{Calculus of Constructions}}\emph{%#<i>#Calculus of Constructions (CoC)#</i>#%}~\cite{CoC}%. CIC is quite a spartan foundation, which is helpful for proving metatheory but not so helpful for real development. Still, it is nice to know that it has been proved that CIC enjoys properties like %\index{strong normalization}\emph{%#<i>#strong normalization#</i>#%}~\cite{CIC}%, meaning that every program (and, more importantly, every proof term) terminates; and %\index{relative consistency}\emph{%#<i>#relative consistency#</i>#%}~\cite{SetsInTypes}% with systems like versions of %\index{Zermelo-Fraenkel set theory}%Zermelo-Fraenkel set theory, which roughly means that you can believe that Coq proofs mean that the corresponding propositions are %``%#"#really true,#"#%''% if you believe in set theory. +This is as good a time as any to mention the preponderance of different languages associated with Coq. The theoretical foundation of Coq is a formal system called the %\index{Calculus of Inductive Constructions}\index{CIC|see{Calculus of Inductive Constructions}}%_Calculus of Inductive Constructions (CIC)_ %\cite{CIC}%, which is an extension of the older %\index{Calculus of Constructions}\index{CoC|see{Calculus of Constructions}}%_Calculus of Constructions (CoC)_ %\cite{CoC}%. CIC is quite a spartan foundation, which is helpful for proving metatheory but not so helpful for real development. Still, it is nice to know that it has been proved that CIC enjoys properties like %\index{strong normalization}%_strong normalization_ %\cite{CIC}%, meaning that every program (and, more importantly, every proof term) terminates; and %\index{relative consistency}%_relative consistency_ %\cite{SetsInTypes}% with systems like versions of %\index{Zermelo-Fraenkel set theory}%Zermelo-Fraenkel set theory, which roughly means that you can believe that Coq proofs mean that the corresponding propositions are %``%#"#really true,#"#%''% if you believe in set theory. -Coq is actually based on an extension of CIC called %\index{Gallina}\emph{%#<i>#Gallina#</i>#%}%. The text after the [:=] and before the period in the last code example is a term of Gallina. Gallina adds many useful features that are not compiled internally to more primitive CIC features. The important metatheorems about CIC have not been extended to the full breadth of these features, but most Coq users do not seem to lose much sleep over this omission. +Coq is actually based on an extension of CIC called %\index{Gallina}%_Gallina_. The text after the [:=] and before the period in the last code example is a term of Gallina. Gallina adds many useful features that are not compiled internally to more primitive CIC features. The important metatheorems about CIC have not been extended to the full breadth of these features, but most Coq users do not seem to lose much sleep over this omission. -Next, there is %\index{Ltac}\emph{%#<i>#Ltac#</i>#%}%, Coq's domain-specific language for writing proofs and decision procedures. We will see some basic examples of Ltac later in this chapter, and much of this book is devoted to more involved Ltac examples. +Next, there is %\index{Ltac}%_Ltac_, Coq's domain-specific language for writing proofs and decision procedures. We will see some basic examples of Ltac later in this chapter, and much of this book is devoted to more involved Ltac examples. -Finally, commands like [Inductive] and [Definition] are part of %\index{Vernacular commands}\emph{%#<i>#the Vernacular#</i>#%}%, which includes all sorts of useful queries and requests to the Coq system. Every Coq source file is a series of vernacular commands, where many command forms take arguments that are Gallina or Ltac programs. (Actually, Coq source files are more like %\emph{%#<i>#trees#</i>#%}% of vernacular commands, thanks to various nested scoping constructs.) +Finally, commands like [Inductive] and [Definition] are part of %\index{Vernacular commands}%_the Vernacular_, which includes all sorts of useful queries and requests to the Coq system. Every Coq source file is a series of vernacular commands, where many command forms take arguments that are Gallina or Ltac programs. (Actually, Coq source files are more like _trees_ of vernacular commands, thanks to various nested scoping constructs.) %\medskip% @@ -107,7 +107,7 @@ (** We declare explicitly that this is a recursive definition, using the keyword [Fixpoint]. The rest should be old hat for functional programmers. *) -(** It is convenient to be able to test definitions before starting to prove things about them. We can verify that our semantics is sensible by evaluating some sample uses, using the command %\index{Vernacular commands!Eval}%[Eval]. This command takes an argument expressing a %\index{reduction strategy}\emph{%#<i>#reduction strategy#</i>#%}%, or an %``%#"#order of evaluation.#"#%''% Unlike with ML, which hardcodes an %\emph{%#<i>#eager#</i>#%}% reduction strategy, or Haskell, which hardcodes a %\emph{%#<i>#lazy#</i>#%}% strategy, in Coq we are free to choose between these and many other orders of evaluation, because all Coq programs terminate. In fact, Coq silently checked %\index{termination checking}%termination of our [Fixpoint] definition above, using a simple heuristic based on monotonically decreasing size of arguments across recursive calls. Specifically, recursive calls must be made on arguments that were pulled out of the original recursive argument with [match] expressions. (In Chapter 7, we will see some ways of getting around this restriction, though simply removing the restriction would leave Coq useless as a theorem proving tool, for reasons we will start to learn about in the next chapter.) +(** It is convenient to be able to test definitions before starting to prove things about them. We can verify that our semantics is sensible by evaluating some sample uses, using the command %\index{Vernacular commands!Eval}%[Eval]. This command takes an argument expressing a %\index{reduction strategy}%_reduction strategy_, or an %``%#"#order of evaluation.#"#%''% Unlike with ML, which hardcodes an _eager_ reduction strategy, or Haskell, which hardcodes a _lazy_ strategy, in Coq we are free to choose between these and many other orders of evaluation, because all Coq programs terminate. In fact, Coq silently checked %\index{termination checking}%termination of our [Fixpoint] definition above, using a simple heuristic based on monotonically decreasing size of arguments across recursive calls. Specifically, recursive calls must be made on arguments that were pulled out of the original recursive argument with [match] expressions. (In Chapter 7, we will see some ways of getting around this restriction, though simply removing the restriction would leave Coq useless as a theorem proving tool, for reasons we will start to learn about in the next chapter.) To return to our test evaluations, we run the [Eval] command using the [simpl] evaluation strategy, whose definition is best postponed until we have learned more about Coq's foundations, but which usually gets the job done. *) @@ -203,12 +203,12 @@ (* end hide *) (* begin thide *) -(** Though a pencil-and-paper proof might clock out at this point, writing %``%#"#by a routine induction on [e],#"#%''% it turns out not to make sense to attack this proof directly. We need to use the standard trick of %\index{strengthening the induction hypothesis}\emph{%#<i>#strengthening the induction hypothesis#</i>#%}%. We do that by proving an auxiliary lemma, using the command [Lemma] that is a synonym for [Theorem], conventionally used for less important theorems that appear in the proofs of primary theorems.%\index{Vernacular commands!Lemma}% *) +(** Though a pencil-and-paper proof might clock out at this point, writing %``%#"#by a routine induction on [e],#"#%''% it turns out not to make sense to attack this proof directly. We need to use the standard trick of %\index{strengthening the induction hypothesis}%_strengthening the induction hypothesis_. We do that by proving an auxiliary lemma, using the command [Lemma] that is a synonym for [Theorem], conventionally used for less important theorems that appear in the proofs of primary theorems.%\index{Vernacular commands!Lemma}% *) Lemma compile_correct' : forall e p s, progDenote (compile e ++ p) s = progDenote p (expDenote e :: s). -(** After the period in the [Lemma] command, we are in %\index{interactive proof-editing mode}\emph{%#<i>#the interactive proof-editing mode#</i>#%}%. We find ourselves staring at this ominous screen of text: +(** After the period in the [Lemma] command, we are in %\index{interactive proof-editing mode}%_the interactive proof-editing mode_. We find ourselves staring at this ominous screen of text: [[ 1 subgoal @@ -223,7 +223,7 @@ Next in the output, we see our single subgoal described in full detail. There is a double-dashed line, above which would be our free variables and %\index{hypotheses}%hypotheses, if we had any. Below the line is the %\index{conclusion}%conclusion, which, in general, is to be proved from the hypotheses. -We manipulate the proof state by running commands called %\index{tactics}\emph{%#<i>#tactics#</i>#%}%. Let us start out by running one of the most important tactics:%\index{tactics!induction}% +We manipulate the proof state by running commands called %\index{tactics}%_tactics_. Let us start out by running one of the most important tactics:%\index{tactics!induction}% *) induction e. @@ -509,7 +509,7 @@ The [crush] tactic comes from the library associated with this book and is not part of the Coq standard library. The book's library contains a number of other tactics that are especially helpful in highly-automated proofs. -The %\index{Vernacular commands!Qed}%[Qed] command checks that the proof is finished and, if so, saves it. The tactic commands we have written above are an example of a %\emph{%#<i>#proof script#</i>#%}%, or a series of Ltac programs; while [Qed] uses the result of the script to generate a %\emph{%#<i>#proof term#</i>#%}%, a well-typed term of Gallina. To believe that a theorem is true, we only need to trust that the (relatively simple) checker for proof terms is correct; the use of proof scripts is immaterial. Part I of this book will introduce the principles behind encoding all proofs as terms of Gallina. +The %\index{Vernacular commands!Qed}%[Qed] command checks that the proof is finished and, if so, saves it. The tactic commands we have written above are an example of a _proof script_, or a series of Ltac programs; while [Qed] uses the result of the script to generate a _proof term_, a well-typed term of Gallina. To believe that a theorem is true, we only need to trust that the (relatively simple) checker for proof terms is correct; the use of proof scripts is immaterial. Part I of this book will introduce the principles behind encoding all proofs as terms of Gallina. The proof of our main theorem is now easy. We prove it with four period-terminated tactics, though separating them with semicolons would work as well; the version here is easier to step through.%\index{tactics!intros}% *) @@ -581,18 +581,18 @@ | TEq : forall t, tbinop t t Bool | TLt : tbinop Nat Nat Bool. -(** The definition of [tbinop] is different from [binop] in an important way. Where we declared that [binop] has type [Set], here we declare that [tbinop] has type [type -> type -> type -> Set]. We define [tbinop] as an %\emph{%#<i>#indexed type family#</i>#%}%. Indexed inductive types are at the heart of Coq's expressive power; almost everything else of interest is defined in terms of them. +(** The definition of [tbinop] is different from [binop] in an important way. Where we declared that [binop] has type [Set], here we declare that [tbinop] has type [type -> type -> type -> Set]. We define [tbinop] as an _indexed type family_. Indexed inductive types are at the heart of Coq's expressive power; almost everything else of interest is defined in terms of them. -The inuitive explanation of [tbinop] is that a [tbinop t1 t2 t] is a binary operator whose operands should have types [t1] and [t2], and whose result has type [t]. For instance, constructor [TLt] (for less-than comparison of numbers) is assigned type [tbinop Nat Nat Bool], meaning the operator's arguments are naturals and its result is boolean. The type of [TEq] introduces a small bit of additional complication via polymorphism: we want to allow equality comparison of any two values of any type, as long as they have the %\emph{%#<i>#same#</i>#%}% type. +The inuitive explanation of [tbinop] is that a [tbinop t1 t2 t] is a binary operator whose operands should have types [t1] and [t2], and whose result has type [t]. For instance, constructor [TLt] (for less-than comparison of numbers) is assigned type [tbinop Nat Nat Bool], meaning the operator's arguments are naturals and its result is boolean. The type of [TEq] introduces a small bit of additional complication via polymorphism: we want to allow equality comparison of any two values of any type, as long as they have the _same_ type. ML and Haskell have indexed algebraic datatypes. For instance, their list types are indexed by the type of data that the list carries. However, compared to Coq, ML and Haskell 98 place two important restrictions on datatype definitions. -First, the indices of the range of each data constructor must be type variables bound at the top level of the datatype definition. There is no way to do what we did here, where we, for instance, say that [TPlus] is a constructor building a [tbinop] whose indices are all fixed at [Nat]. %\index{generalized algebraic datatypes}\index{GADTs|see{generalized algebraic datatypes}}\emph{%#<i>#Generalized algebraic datatypes (GADTs)#</i>#%}~\cite{GADT}% are a popular feature in %\index{GHC Haskell}%GHC Haskell and other languages that removes this first restriction. +First, the indices of the range of each data constructor must be type variables bound at the top level of the datatype definition. There is no way to do what we did here, where we, for instance, say that [TPlus] is a constructor building a [tbinop] whose indices are all fixed at [Nat]. %\index{generalized algebraic datatypes}\index{GADTs|see{generalized algebraic datatypes}}%_Generalized algebraic datatypes (GADTs)_ %\cite{GADT}% are a popular feature in %\index{GHC Haskell}%GHC Haskell and other languages that removes this first restriction. -The second restriction is not lifted by GADTs. In ML and Haskell, indices of types must be types and may not be %\emph{%#<i>#expressions#</i>#%}%. In Coq, types may be indexed by arbitrary Gallina terms. Type indices can live in the same universe as programs, and we can compute with them just like regular programs. Haskell supports a hobbled form of computation in type indices based on %\index{Haskell}%multi-parameter type classes, and recent extensions like type functions bring Haskell programming even closer to %``%#"#real#"#%''% functional programming with types, but, without dependent typing, there must always be a gap between how one programs with types and how one programs normally. +The second restriction is not lifted by GADTs. In ML and Haskell, indices of types must be types and may not be _expressions_. In Coq, types may be indexed by arbitrary Gallina terms. Type indices can live in the same universe as programs, and we can compute with them just like regular programs. Haskell supports a hobbled form of computation in type indices based on %\index{Haskell}%multi-parameter type classes, and recent extensions like type functions bring Haskell programming even closer to %``%#"#real#"#%''% functional programming with types, but, without dependent typing, there must always be a gap between how one programs with types and how one programs normally. *) -(** We can define a similar type family for typed expressions, where a term of type [texp t] can be assigned object language type [t]. (It is conventional in the world of interactive theorem proving to call the language of the proof assistant the %\index{meta language}\emph{%#<i>#meta language#</i>#%}% and a language being formalized the %\index{object language}\emph{%#<i>#object language#</i>#%}%.) *) +(** We can define a similar type family for typed expressions, where a term of type [texp t] can be assigned object language type [t]. (It is conventional in the world of interactive theorem proving to call the language of the proof assistant the %\index{meta language}%_meta language_ and a language being formalized the %\index{object language}%_object language_.) *) Inductive texp : type -> Set := | TNConst : nat -> texp Nat @@ -619,7 +619,7 @@ | TLt => leb end. -(** This function has just a few differences from the denotation functions we saw earlier. First, [tbinop] is an indexed type, so its indices become additional arguments to [tbinopDenote]. Second, we need to perform a genuine %\index{dependent pattern matching}\emph{%#<i>#dependent pattern match#</i>#%}%, where the necessary %\emph{%#<i>#type#</i>#%}% of each case body depends on the %\emph{%#<i>#value#</i>#%}% that has been matched. At this early stage, we will not go into detail on the many subtle aspects of Gallina that support dependent pattern-matching, but the subject is central to Part II of the book. +(** This function has just a few differences from the denotation functions we saw earlier. First, [tbinop] is an indexed type, so its indices become additional arguments to [tbinopDenote]. Second, we need to perform a genuine %\index{dependent pattern matching}%_dependent pattern match_, where the necessary _type_ of each case body depends on the _value_ that has been matched. At this early stage, we will not go into detail on the many subtle aspects of Gallina that support dependent pattern-matching, but the subject is central to Part II of the book. The same tricks suffice to define an expression denotation function in an unsurprising way: *) @@ -755,7 +755,7 @@ (tconcat (tcompile e1 _) (TCons (TiBinop _ b) (TNil _))) end. -(** One interesting feature of the definition is the underscores appearing to the right of [=>] arrows. Haskell and ML programmers are quite familiar with compilers that infer type parameters to polymorphic values. In Coq, it is possible to go even further and ask the system to infer arbitrary terms, by writing underscores in place of specific values. You may have noticed that we have been calling functions without specifying all of their arguments. For instance, the recursive calls here to [tcompile] omit the [t] argument. Coq's %\emph{%#<i>#implicit argument#</i>#%}% mechanism automatically inserts underscores for arguments that it will probably be able to infer. Inference of such values is far from complete, though; generally, it only works in cases similar to those encountered with polymorphic type instantiation in Haskell and ML. +(** One interesting feature of the definition is the underscores appearing to the right of [=>] arrows. Haskell and ML programmers are quite familiar with compilers that infer type parameters to polymorphic values. In Coq, it is possible to go even further and ask the system to infer arbitrary terms, by writing underscores in place of specific values. You may have noticed that we have been calling functions without specifying all of their arguments. For instance, the recursive calls here to [tcompile] omit the [t] argument. Coq's _implicit argument_ mechanism automatically inserts underscores for arguments that it will probably be able to infer. Inference of such values is far from complete, though; generally, it only works in cases similar to those encountered with polymorphic type instantiation in Haskell and ML. The underscores here are being filled in with stack types. That is, the Coq type inferencer is, in a sense, inferring something about the flow of control in the translated programs. We can take a look at exactly which values are filled in: *) @@ -852,7 +852,7 @@ (* end hide *) (** %\noindent%[Hint] %\coqdockw{%#<tt>#Rewrite#</tt>#%}% [tconcat_correct.] *) -(** Here we meet the pervasive concept of a %\emph{%#<i>#hint#</i>#%}%. Many proofs can be found through exhaustive enumerations of combinations of possible proof steps; hints provide the set of steps to consider. The tactic [crush] is applying such brute force search for us silently, and it will consider more possibilities as we add more hints. This particular hint asks that the lemma be used for left-to-right rewriting, and we ask for the hint to be added to the hint database called [cpdt], which is the database used by [crush]. In general, segragating hints into different databases is helpful to control the performance of proof search, in cases where domain knowledge allows us to narrow the set of proof steps to be considered in brute force search. Part III of this book considers such pragmatic aspects of proof search in much more detail. +(** Here we meet the pervasive concept of a _hint_. Many proofs can be found through exhaustive enumerations of combinations of possible proof steps; hints provide the set of steps to consider. The tactic [crush] is applying such brute force search for us silently, and it will consider more possibilities as we add more hints. This particular hint asks that the lemma be used for left-to-right rewriting, and we ask for the hint to be added to the hint database called [cpdt], which is the database used by [crush]. In general, segragating hints into different databases is helpful to control the performance of proof search, in cases where domain knowledge allows us to narrow the set of proof steps to be considered in brute force search. Part III of this book considers such pragmatic aspects of proof search in much more detail. Now we are ready to return to [tcompile_correct'], proving it automatically this time. *) @@ -874,7 +874,7 @@ Qed. (* end thide *) -(** It is probably worth emphasizing that we are doing more than building mathematical models. Our compilers are functional programs that can be executed efficiently. One strategy for doing so is based on %\index{program extraction}\emph{%#<i>#program extraction#</i>#%}%, which generates OCaml code from Coq developments. For instance, we run a command to output the OCaml version of [tcompile]:%\index{Vernacular commands!Extraction}% *) +(** It is probably worth emphasizing that we are doing more than building mathematical models. Our compilers are functional programs that can be executed efficiently. One strategy for doing so is based on %\index{program extraction}%_program extraction_, which generates OCaml code from Coq developments. For instance, we run a command to output the OCaml version of [tcompile]:%\index{Vernacular commands!Extraction}% *) (* begin hide *) Extraction tcompile.