comparison src/StackMachine.v @ 292:2c88fc1dbe33

A pass of double-quotes and LaTeX operator beautification
author Adam Chlipala <adam@chlipala.net>
date Wed, 10 Nov 2010 16:31:04 -0500
parents fabbc71abd80
children 1b6c81e51799
comparison
equal deleted inserted replaced
291:da9ccc6bf572 292:2c88fc1dbe33
69 69
70 A note for readers following along in the PDF version: coqdoc supports pretty-printing of tokens in LaTeX or HTML. Where you see a right arrow character, the source contains the ASCII text %\texttt{%#<tt>#->#</tt>#%}%. Other examples of this substitution appearing in this chapter are a double right arrow for %\texttt{%#<tt>#=>#</tt>#%}%, the inverted 'A' symbol for %\texttt{%#<tt>#forall#</tt>#%}%, and the Cartesian product 'X' for %\texttt{%#<tt>#*#</tt>#%}%. When in doubt about the ASCII version of a symbol, you can consult the chapter source code. 70 A note for readers following along in the PDF version: coqdoc supports pretty-printing of tokens in LaTeX or HTML. Where you see a right arrow character, the source contains the ASCII text %\texttt{%#<tt>#->#</tt>#%}%. Other examples of this substitution appearing in this chapter are a double right arrow for %\texttt{%#<tt>#=>#</tt>#%}%, the inverted 'A' symbol for %\texttt{%#<tt>#forall#</tt>#%}%, and the Cartesian product 'X' for %\texttt{%#<tt>#*#</tt>#%}%. When in doubt about the ASCII version of a symbol, you can consult the chapter source code.
71 71
72 %\medskip% 72 %\medskip%
73 73
74 Now we are ready to say what these programs mean. We will do this by writing an interpreter that can be thought of as a trivial operational or denotational semantics. (If you are not familiar with these semantic techniques, no need to worry; we will stick to "common sense" constructions.) *) 74 Now we are ready to say what these programs mean. We will do this by writing an interpreter that can be thought of as a trivial operational or denotational semantics. (If you are not familiar with these semantic techniques, no need to worry; we will stick to %``%#"#common sense#"#%''% constructions.) *)
75 75
76 Definition binopDenote (b : binop) : nat -> nat -> nat := 76 Definition binopDenote (b : binop) : nat -> nat -> nat :=
77 match b with 77 match b with
78 | Plus => plus 78 | Plus => plus
79 | Times => mult 79 | Times => mult
99 | Times => mult 99 | Times => mult
100 end. 100 end.
101 101
102 ]] 102 ]]
103 103
104 Languages like Haskell and ML have a convenient %\textit{%#<i>#principal typing#</i>#%}% property, which gives us strong guarantees about how effective type inference will be. Unfortunately, Coq's type system is so expressive that any kind of "complete" type inference is impossible, and the task even seems to be hard heuristically in practice. Nonetheless, Coq includes some very helpful heuristics, many of them copying the workings of Haskell and ML type-checkers for programs that fall in simple fragments of Coq's language. 104 Languages like Haskell and ML have a convenient %\textit{%#<i>#principal typing#</i>#%}% property, which gives us strong guarantees about how effective type inference will be. Unfortunately, Coq's type system is so expressive that any kind of %``%#"#complete#"#%''% type inference is impossible, and the task even seems to be hard heuristically in practice. Nonetheless, Coq includes some very helpful heuristics, many of them copying the workings of Haskell and ML type-checkers for programs that fall in simple fragments of Coq's language.
105 105
106 This is as good a time as any to mention the preponderance of different languages associated with Coq. The theoretical foundation of Coq is a formal system called the %\textit{%#<i>#Calculus of Inductive Constructions (CIC)#</i>#%}%, which is an extension of the older %\textit{%#<i>#Calculus of Constructions (CoC)#</i>#%}%. CIC is quite a spartan foundation, which is helpful for proving metatheory but not so helpful for real development. Still, it is nice to know that it has been proved that CIC enjoys properties like %\textit{%#<i>#strong normalization#</i>#%}%, meaning that every program (and, more importantly, every proof term) terminates; and %\textit{%#<i>#relative consistency#</i>#%}% with systems like versions of Zermelo Fraenkel set theory, which roughly means that you can believe that Coq proofs mean that the corresponding propositions are "really true," if you believe in set theory. 106 This is as good a time as any to mention the preponderance of different languages associated with Coq. The theoretical foundation of Coq is a formal system called the %\textit{%#<i>#Calculus of Inductive Constructions (CIC)#</i>#%}%, which is an extension of the older %\textit{%#<i>#Calculus of Constructions (CoC)#</i>#%}%. CIC is quite a spartan foundation, which is helpful for proving metatheory but not so helpful for real development. Still, it is nice to know that it has been proved that CIC enjoys properties like %\textit{%#<i>#strong normalization#</i>#%}%, meaning that every program (and, more importantly, every proof term) terminates; and %\textit{%#<i>#relative consistency#</i>#%}% with systems like versions of Zermelo Fraenkel set theory, which roughly means that you can believe that Coq proofs mean that the corresponding propositions are %``%#"#really true,#"#%''% if you believe in set theory.
107 107
108 Coq is actually based on an extension of CIC called %\textit{%#<i>#Gallina#</i>#%}%. The text after the [:=] and before the period in the last code example is a term of Gallina. Gallina adds many useful features that are not compiled internally to more primitive CIC features. The important metatheorems about CIC have not been extended to the full breadth of these features, but most Coq users do not seem to lose much sleep over this omission. 108 Coq is actually based on an extension of CIC called %\textit{%#<i>#Gallina#</i>#%}%. The text after the [:=] and before the period in the last code example is a term of Gallina. Gallina adds many useful features that are not compiled internally to more primitive CIC features. The important metatheorems about CIC have not been extended to the full breadth of these features, but most Coq users do not seem to lose much sleep over this omission.
109 109
110 Commands like [Inductive] and [Definition] are part of %\textit{%#<i>#the vernacular#</i>#%}%, which includes all sorts of useful queries and requests to the Coq system. 110 Commands like [Inductive] and [Definition] are part of %\textit{%#<i>#the vernacular#</i>#%}%, which includes all sorts of useful queries and requests to the Coq system.
111 111
145 Definition prog := list instr. 145 Definition prog := list instr.
146 Definition stack := list nat. 146 Definition stack := list nat.
147 147
148 (** An instruction either pushes a constant onto the stack or pops two arguments, applies a binary operator to them, and pushes the result onto the stack. A program is a list of instructions, and a stack is a list of natural numbers. 148 (** An instruction either pushes a constant onto the stack or pops two arguments, applies a binary operator to them, and pushes the result onto the stack. A program is a list of instructions, and a stack is a list of natural numbers.
149 149
150 We can give instructions meanings as functions from stacks to optional stacks, where running an instruction results in [None] in case of a stack underflow and results in [Some s'] when the result of execution is the new stack [s']. [::] is the "list cons" operator from the Coq standard library. *) 150 We can give instructions meanings as functions from stacks to optional stacks, where running an instruction results in [None] in case of a stack underflow and results in [Some s'] when the result of execution is the new stack [s']. [::] is the %``%#"#list cons#"#%''% operator from the Coq standard library. *)
151 151
152 Definition instrDenote (i : instr) (s : stack) : option stack := 152 Definition instrDenote (i : instr) (s : stack) : option stack :=
153 match i with 153 match i with
154 | IConst n => Some (n :: s) 154 | IConst n => Some (n :: s)
155 | IBinop b => 155 | IBinop b =>
231 (* begin hide *) 231 (* begin hide *)
232 Abort. 232 Abort.
233 (* end hide *) 233 (* end hide *)
234 (* begin thide *) 234 (* begin thide *)
235 235
236 (** Though a pencil-and-paper proof might clock out at this point, writing "by a routine induction on [e]," it turns out not to make sense to attack this proof directly. We need to use the standard trick of %\textit{%#<i>#strengthening the induction hypothesis#</i>#%}%. We do that by proving an auxiliary lemma: 236 (** Though a pencil-and-paper proof might clock out at this point, writing %``%#"#by a routine induction on [e],#"#%''% it turns out not to make sense to attack this proof directly. We need to use the standard trick of %\textit{%#<i>#strengthening the induction hypothesis#</i>#%}%. We do that by proving an auxiliary lemma:
237 *) 237 *)
238 238
239 Lemma compile_correct' : forall e p s, 239 Lemma compile_correct' : forall e p s,
240 progDenote (compile e ++ p) s = progDenote p (expDenote e :: s). 240 progDenote (compile e ++ p) s = progDenote p (expDenote e :: s).
241 241
583 583
584 ML and Haskell have indexed algebraic datatypes. For instance, their list types are indexed by the type of data that the list carries. However, compared to Coq, ML and Haskell 98 place two important restrictions on datatype definitions. 584 ML and Haskell have indexed algebraic datatypes. For instance, their list types are indexed by the type of data that the list carries. However, compared to Coq, ML and Haskell 98 place two important restrictions on datatype definitions.
585 585
586 First, the indices of the range of each data constructor must be type variables bound at the top level of the datatype definition. There is no way to do what we did here, where we, for instance, say that [TPlus] is a constructor building a [tbinop] whose indices are all fixed at [Nat]. %\textit{%#<i>#Generalized algebraic datatypes (GADTs)#</i>#%}% are a popular feature in GHC Haskell and other languages that removes this first restriction. 586 First, the indices of the range of each data constructor must be type variables bound at the top level of the datatype definition. There is no way to do what we did here, where we, for instance, say that [TPlus] is a constructor building a [tbinop] whose indices are all fixed at [Nat]. %\textit{%#<i>#Generalized algebraic datatypes (GADTs)#</i>#%}% are a popular feature in GHC Haskell and other languages that removes this first restriction.
587 587
588 The second restriction is not lifted by GADTs. In ML and Haskell, indices of types must be types and may not be %\textit{%#<i>#expressions#</i>#%}%. In Coq, types may be indexed by arbitrary Gallina terms. Type indices can live in the same universe as programs, and we can compute with them just like regular programs. Haskell supports a hobbled form of computation in type indices based on multi-parameter type classes, and recent extensions like type functions bring Haskell programming even closer to "real" functional programming with types, but, without dependent typing, there must always be a gap between how one programs with types and how one programs normally. 588 The second restriction is not lifted by GADTs. In ML and Haskell, indices of types must be types and may not be %\textit{%#<i>#expressions#</i>#%}%. In Coq, types may be indexed by arbitrary Gallina terms. Type indices can live in the same universe as programs, and we can compute with them just like regular programs. Haskell supports a hobbled form of computation in type indices based on multi-parameter type classes, and recent extensions like type functions bring Haskell programming even closer to %``%#"#real#"#%''% functional programming with types, but, without dependent typing, there must always be a gap between how one programs with types and how one programs normally.
589 *) 589 *)
590 590
591 (** We can define a similar type family for typed expressions. *) 591 (** We can define a similar type family for typed expressions. *)
592 592
593 Inductive texp : type -> Set := 593 Inductive texp : type -> Set :=
603 | Bool => bool 603 | Bool => bool
604 end. 604 end.
605 605
606 (** It can take a few moments to come to terms with the fact that [Set], the type of types of programs, is itself a first-class type, and that we can write functions that return [Set]s. Past that wrinkle, the definition of [typeDenote] is trivial, relying on the [nat] and [bool] types from the Coq standard library. 606 (** It can take a few moments to come to terms with the fact that [Set], the type of types of programs, is itself a first-class type, and that we can write functions that return [Set]s. Past that wrinkle, the definition of [typeDenote] is trivial, relying on the [nat] and [bool] types from the Coq standard library.
607 607
608 We need to define one auxiliary function, implementing a boolean binary "less-than" operator, which only appears in the standard library with a type fancier than what we are prepared to deal with here. The code is entirely standard and ML-like, with the one caveat being that the Coq [nat] type uses a unary representation, where [O] is zero and [S n] is the successor of [n]. 608 We need to define one auxiliary function, implementing a boolean binary %``%#"#less-than#"#%''% operator, which only appears in the standard library with a type fancier than what we are prepared to deal with here. The code is entirely standard and ML-like, with the one caveat being that the Coq [nat] type uses a unary representation, where [O] is zero and [S n] is the successor of [n].
609 *) 609 *)
610 610
611 Fixpoint lessThan (n1 n2 : nat) : bool := 611 Fixpoint lessThan (n1 n2 : nat) : bool :=
612 match n1, n2 with 612 match n1, n2 with
613 | O, S _ => true 613 | O, S _ => true
676 (** [= true : typeDenote Bool] *) 676 (** [= true : typeDenote Bool] *)
677 677
678 678
679 (** ** Target Language *) 679 (** ** Target Language *)
680 680
681 (** Now we want to define a suitable stack machine target for compilation. In the example of the untyped language, stack machine programs could encounter stack underflows and "get stuck." This was unfortunate, since we had to deal with this complication even though we proved that our compiler never produced underflowing programs. We could have used dependent types to force all stack machine programs to be underflow-free. 681 (** Now we want to define a suitable stack machine target for compilation. In the example of the untyped language, stack machine programs could encounter stack underflows and %``%#"#get stuck.#"#%''% This was unfortunate, since we had to deal with this complication even though we proved that our compiler never produced underflowing programs. We could have used dependent types to force all stack machine programs to be underflow-free.
682 682
683 For our new languages, besides underflow, we also have the problem of stack slots with naturals instead of bools or vice versa. This time, we will use indexed typed families to avoid the need to reason about potential failures. 683 For our new languages, besides underflow, we also have the problem of stack slots with naturals instead of bools or vice versa. This time, we will use indexed typed families to avoid the need to reason about potential failures.
684 684
685 We start by defining stack types, which classify sets of possible stacks. *) 685 We start by defining stack types, which classify sets of possible stacks. *)
686 686
862 (** Again, we need to strengthen the theorem statement so that the induction will go through. This time, I will develop an alternative approach to this kind of proof, stating the key lemma as: *) 862 (** Again, we need to strengthen the theorem statement so that the induction will go through. This time, I will develop an alternative approach to this kind of proof, stating the key lemma as: *)
863 863
864 Lemma tcompile_correct' : forall t (e : texp t) ts (s : vstack ts), 864 Lemma tcompile_correct' : forall t (e : texp t) ts (s : vstack ts),
865 tprogDenote (tcompile e ts) s = (texpDenote e, s). 865 tprogDenote (tcompile e ts) s = (texpDenote e, s).
866 866
867 (** While lemma [compile_correct'] quantified over a program that is the "continuation" for the expression we are considering, here we avoid drawing in any extra syntactic elements. In addition to the source expression and its type, we also quantify over an initial stack type and a stack compatible with it. Running the compilation of the program starting from that stack, we should arrive at a stack that differs only in having the program's denotation pushed onto it. 867 (** While lemma [compile_correct'] quantified over a program that is the %``%#"#continuation#"#%''% for the expression we are considering, here we avoid drawing in any extra syntactic elements. In addition to the source expression and its type, we also quantify over an initial stack type and a stack compatible with it. Running the compilation of the program starting from that stack, we should arrive at a stack that differs only in having the program's denotation pushed onto it.
868 868
869 Let us try to prove this theorem in the same way that we settled on in the last section. *) 869 Let us try to prove this theorem in the same way that we settled on in the last section. *)
870 870
871 induction e; crush. 871 induction e; crush.
872 872