comparison src/Intro.v @ 292:2c88fc1dbe33

A pass of double-quotes and LaTeX operator beautification
author Adam Chlipala <adam@chlipala.net>
date Wed, 10 Nov 2010 16:31:04 -0500
parents caa69851c78d
children 25ec92be5ad2
comparison
equal deleted inserted replaced
291:da9ccc6bf572 292:2c88fc1dbe33
47 47
48 (** 48 (**
49 49
50 We would all like to have programs check that our programs are correct. Due in no small part to some bold but unfulfilled promises in the history of computer science, today most people who write software, practitioners and academics alike, assume that the costs of formal program verification outweigh the benefits. The purpose of this book is to convince you that the technology of program verification is mature enough today that it makes sense to use it in a support role in many kinds of research projects in computer science. Beyond the convincing, I also want to provide a handbook on practical engineering of certified programs with the Coq proof assistant. 50 We would all like to have programs check that our programs are correct. Due in no small part to some bold but unfulfilled promises in the history of computer science, today most people who write software, practitioners and academics alike, assume that the costs of formal program verification outweigh the benefits. The purpose of this book is to convince you that the technology of program verification is mature enough today that it makes sense to use it in a support role in many kinds of research projects in computer science. Beyond the convincing, I also want to provide a handbook on practical engineering of certified programs with the Coq proof assistant.
51 51
52 There are a good number of (though definitely not "many") tools that are in wide use today for building machine-checked mathematical proofs and machine-certified programs. This is my attempt at an exhaustive list of interactive "proof assistants" satisfying a few criteria. First, the authors of each tool must intend for it to be put to use for software-related applications. Second, there must have been enough engineering effort put into the tool that someone not doing research on the tool itself would feel his time was well spent using it. A third criterion is more of an empirical validation of the second: the tool must have a significant user community outside of its own development team. 52 There are a good number of (though definitely not %``%#"#many#"#%''%) tools that are in wide use today for building machine-checked mathematical proofs and machine-certified programs. This is my attempt at an exhaustive list of interactive %``%#"#proof assistants#"#%''% satisfying a few criteria. First, the authors of each tool must intend for it to be put to use for software-related applications. Second, there must have been enough engineering effort put into the tool that someone not doing research on the tool itself would feel his time was well spent using it. A third criterion is more of an empirical validation of the second: the tool must have a significant user community outside of its own development team.
53 53
54 % 54 %
55 \medskip 55 \medskip
56 56
57 \begin{tabular}{rl} 57 \begin{tabular}{rl}
72 <tr><td align="right"><b>PVS</b></td> <td><a href="http://pvs.csl.sri.com/">http://pvs.csl.sri.com/</a></td></tr> 72 <tr><td align="right"><b>PVS</b></td> <td><a href="http://pvs.csl.sri.com/">http://pvs.csl.sri.com/</a></td></tr>
73 <tr><td align="right"><b>Twelf</b></td> <td><a href="http://www.twelf.org/">http://www.twelf.org/</a></td></tr> 73 <tr><td align="right"><b>Twelf</b></td> <td><a href="http://www.twelf.org/">http://www.twelf.org/</a></td></tr>
74 </table> 74 </table>
75 # 75 #
76 76
77 Isabelle/HOL, implemented with the "proof assistant development framework" Isabelle, is the most popular proof assistant for the HOL logic. The other implementations of HOL can be considered equivalent for purposes of the discussion here. 77 Isabelle/HOL, implemented with the %``%#"#proof assistant development framework#"#%''% Isabelle, is the most popular proof assistant for the HOL logic. The other implementations of HOL can be considered equivalent for purposes of the discussion here.
78 78
79 *) 79 *)
80 80
81 81
82 (** * Why Coq? *) 82 (** * Why Coq? *)
109 *) 109 *)
110 110
111 (** ** An Easy-to-Check Kernel Proof Language *) 111 (** ** An Easy-to-Check Kernel Proof Language *)
112 112
113 (** 113 (**
114 Scores of automated decision procedures are useful in practical theorem proving, but it is unfortunate to have to trust in the correct implementation of each procedure. Proof assistants satisfying the "de Bruijn criterion" may use complicated and extensible procedures to seek out proofs, but in the end they produce %\textit{%#<i>#proof terms#</i>#%}% in kernel languages. These core languages have feature complexity on par with what you find in proposals for formal foundations for mathematics. To believe a proof, we can ignore the possibility of bugs during %\textit{%#<i>#search#</i>#%}% and just rely on a (relatively small) proof-checking kernel that we apply to the %\textit{%#<i>#result#</i>#%}% of the search. 114 Scores of automated decision procedures are useful in practical theorem proving, but it is unfortunate to have to trust in the correct implementation of each procedure. Proof assistants satisfying the %``%#"#de Bruijn criterion#"#%''% may use complicated and extensible procedures to seek out proofs, but in the end they produce %\textit{%#<i>#proof terms#</i>#%}% in kernel languages. These core languages have feature complexity on par with what you find in proposals for formal foundations for mathematics. To believe a proof, we can ignore the possibility of bugs during %\textit{%#<i>#search#</i>#%}% and just rely on a (relatively small) proof-checking kernel that we apply to the %\textit{%#<i>#result#</i>#%}% of the search.
115 115
116 ACL2 and PVS do not meet the de Bruijn criterion, employing fancy decision procedures that produce no "evidence trails" justifying their results. 116 ACL2 and PVS do not meet the de Bruijn criterion, employing fancy decision procedures that produce no %``%#"#evidence trails#"#%''% justifying their results.
117 *) 117 *)
118 118
119 (** ** Convenient Programmable Proof Automation *) 119 (** ** Convenient Programmable Proof Automation *)
120 120
121 (** 121 (**
123 123
124 Twelf features no proof automation marked as a bonafide part of the latest release; there is some automation code included for testing purposes. The Twelf style is based on writing out all proofs in full detail. Because Twelf is specialized to the domain of syntactic metatheory proofs about programming languages and logics, it is feasible to use it to write those kinds of proofs manually. Outside that domain, the lack of automation can be a serious obstacle to productivity. Most kinds of program verification fall outside Twelf's forte. 124 Twelf features no proof automation marked as a bonafide part of the latest release; there is some automation code included for testing purposes. The Twelf style is based on writing out all proofs in full detail. Because Twelf is specialized to the domain of syntactic metatheory proofs about programming languages and logics, it is feasible to use it to write those kinds of proofs manually. Outside that domain, the lack of automation can be a serious obstacle to productivity. Most kinds of program verification fall outside Twelf's forte.
125 125
126 Of the remaining tools, all can support user extension with new decision procedures by hacking directly in the tool's implementation language (such as OCaml for Coq). Since ACL2 and PVS do not satisfy the de Bruijn criterion, overall correctness is at the mercy of the authors of new procedures. 126 Of the remaining tools, all can support user extension with new decision procedures by hacking directly in the tool's implementation language (such as OCaml for Coq). Since ACL2 and PVS do not satisfy the de Bruijn criterion, overall correctness is at the mercy of the authors of new procedures.
127 127
128 Isabelle/HOL and Coq both support coding new proof manipulations in ML in ways that cannot lead to the acceptance of invalid proofs. Additionally, Coq includes a domain-specific language for coding decision procedures in normal Coq source code, with no need to break out into ML. This language is called Ltac, and I think of it as the unsung hero of the proof assistant world. Not only does Ltac prevent you from making fatal mistakes, it also includes a number of novel programming constructs which combine to make a "proof by decision procedure" style very pleasant. We will meet these features in the chapters to come. 128 Isabelle/HOL and Coq both support coding new proof manipulations in ML in ways that cannot lead to the acceptance of invalid proofs. Additionally, Coq includes a domain-specific language for coding decision procedures in normal Coq source code, with no need to break out into ML. This language is called Ltac, and I think of it as the unsung hero of the proof assistant world. Not only does Ltac prevent you from making fatal mistakes, it also includes a number of novel programming constructs which combine to make a %``%#"#proof by decision procedure#"#%''% style very pleasant. We will meet these features in the chapters to come.
129 *) 129 *)
130 130
131 (** ** Proof by Reflection *) 131 (** ** Proof by Reflection *)
132 132
133 (** 133 (**
134 A surprising wealth of benefits follow from choosing a proof language that integrates a rich notion of computation. Coq includes programs and proof terms in the same syntactic class. This makes it easy to write programs that compute proofs. With rich enough dependent types, such programs are %\textit{%#<i>#certified decision procedures#</i>#%}%. In such cases, these certified procedures can be put to good use %\textit{%#<i>#without ever running them#</i>#%}%! Their types guarantee that, if we did bother to run them, we would receive proper "ground" proofs. 134 A surprising wealth of benefits follow from choosing a proof language that integrates a rich notion of computation. Coq includes programs and proof terms in the same syntactic class. This makes it easy to write programs that compute proofs. With rich enough dependent types, such programs are %\textit{%#<i>#certified decision procedures#</i>#%}%. In such cases, these certified procedures can be put to good use %\textit{%#<i>#without ever running them#</i>#%}%! Their types guarantee that, if we did bother to run them, we would receive proper %``%#"#ground#"#%''% proofs.
135 135
136 The critical ingredient for this technique, many of whose instances are referred to as %\textit{%#<i>#proof by reflection#</i>#%}%, is a way of inducing non-trivial computation inside of logical propositions during proof checking. Further, most of these instances require dependent types to make it possible to state the appropriate theorems. Of the proof assistants I listed, only Coq really provides this support. 136 The critical ingredient for this technique, many of whose instances are referred to as %\textit{%#<i>#proof by reflection#</i>#%}%, is a way of inducing non-trivial computation inside of logical propositions during proof checking. Further, most of these instances require dependent types to make it possible to state the appropriate theorems. Of the proof assistants I listed, only Coq really provides this support.
137 *) 137 *)
138 138
139 139
142 (** 142 (**
143 The logic and programming language behind Coq belongs to a type-theory ecosystem with a good number of other thriving members. %Agda\footnote{\url{http://appserv.cs.chalmers.se/users/ulfn/wiki/agda.php}}%#<a href="http://appserv.cs.chalmers.se/users/ulfn/wiki/agda.php">Agda</a># and %Epigram\footnote{\url{http://www.e-pig.org/}}%#<a href="http://www.e-pig.org/">Epigram</a># are the most developed tools among the alternatives to Coq, and there are others that are earlier in their lifecycles. All of the languages in this family feel sort of like different historical offshoots of Latin. The hardest conceptual epiphanies are, for the most part, portable among all the languages. Given this, why choose Coq for certified programming? 143 The logic and programming language behind Coq belongs to a type-theory ecosystem with a good number of other thriving members. %Agda\footnote{\url{http://appserv.cs.chalmers.se/users/ulfn/wiki/agda.php}}%#<a href="http://appserv.cs.chalmers.se/users/ulfn/wiki/agda.php">Agda</a># and %Epigram\footnote{\url{http://www.e-pig.org/}}%#<a href="http://www.e-pig.org/">Epigram</a># are the most developed tools among the alternatives to Coq, and there are others that are earlier in their lifecycles. All of the languages in this family feel sort of like different historical offshoots of Latin. The hardest conceptual epiphanies are, for the most part, portable among all the languages. Given this, why choose Coq for certified programming?
144 144
145 I think the answer is simple. None of the competition has well-developed systems for tactic-based theorem proving. Agda and Epigram are designed and marketed more as programming languages than proof assistants. Dependent types are great, because they often help you prove deep theorems without doing anything that feels like proving. Nonetheless, almost any interesting certified programming project will benefit from some activity that deserves to be called proving, and many interesting projects absolutely require semi-automated proving, if the sanity of the programmer is to be safeguarded. Informally, proving is unavoidable when any correctness proof for a program has a structure that does not mirror the structure of the program itself. An example is a compiler correctness proof, which probably proceeds by induction on program execution traces, which have no simple relationship with the structure of the compiler or the structure of the programs it compiles. In building such proofs, a mature system for scripted proof automation is invaluable. 145 I think the answer is simple. None of the competition has well-developed systems for tactic-based theorem proving. Agda and Epigram are designed and marketed more as programming languages than proof assistants. Dependent types are great, because they often help you prove deep theorems without doing anything that feels like proving. Nonetheless, almost any interesting certified programming project will benefit from some activity that deserves to be called proving, and many interesting projects absolutely require semi-automated proving, if the sanity of the programmer is to be safeguarded. Informally, proving is unavoidable when any correctness proof for a program has a structure that does not mirror the structure of the program itself. An example is a compiler correctness proof, which probably proceeds by induction on program execution traces, which have no simple relationship with the structure of the compiler or the structure of the programs it compiles. In building such proofs, a mature system for scripted proof automation is invaluable.
146 146
147 On the other hand, Agda, Epigram, and similar tools have less implementation baggage associated with them, and so they tend to be the default first homes of innovations in practical type theory. Some significant kinds of dependently-typed programs are much easier to write in Agda and Epigram than in Coq. The former tools may very well be superior choices for projects that do not involve any "proving." Anecdotally, I have gotten the impression that manual proving is orders of magnitudes more costly than manual coping with Coq's lack of programming bells and whistles. In this book, I will devote significant time to patterns for programming with dependent types in Coq as it is today. We can hope that the type theory community is tending towards convergence on the right set of features for practical programming with dependent types, and that we will eventually have a single tool embodying those features. 147 On the other hand, Agda, Epigram, and similar tools have less implementation baggage associated with them, and so they tend to be the default first homes of innovations in practical type theory. Some significant kinds of dependently-typed programs are much easier to write in Agda and Epigram than in Coq. The former tools may very well be superior choices for projects that do not involve any %``%#"#proving.#"#%''% Anecdotally, I have gotten the impression that manual proving is orders of magnitudes more costly than manual coping with Coq's lack of programming bells and whistles. In this book, I will devote significant time to patterns for programming with dependent types in Coq as it is today. We can hope that the type theory community is tending towards convergence on the right set of features for practical programming with dependent types, and that we will eventually have a single tool embodying those features.
148 *) 148 *)
149 149
150 150
151 (** * Engineering with a Proof Assistant *) 151 (** * Engineering with a Proof Assistant *)
152 152
153 (** 153 (**
154 In comparisons with its competitors, Coq is often derided for promoting unreadable proofs. It is very easy to write proof scripts that manipulate proof goals imperatively, with no structure to aid readers. Such developments are nightmares to maintain, and they certainly do not manage to convey "why the theorem is true" to anyone but the original author. One additional (and not insignificant) purpose of this book is to show why it is unfair and unproductive to dismiss Coq based on the existence of such developments. 154 In comparisons with its competitors, Coq is often derided for promoting unreadable proofs. It is very easy to write proof scripts that manipulate proof goals imperatively, with no structure to aid readers. Such developments are nightmares to maintain, and they certainly do not manage to convey %``%#"#why the theorem is true#"#%''% to anyone but the original author. One additional (and not insignificant) purpose of this book is to show why it is unfair and unproductive to dismiss Coq based on the existence of such developments.
155 155
156 I will go out on a limb and guess that the reader is a dedicated fan of some functional programming language or another, and that he may even have been involved in teaching that language to undergraduates. I want to propose an analogy between two attitudes: coming to a negative conclusion about Coq after reading common Coq developments in the wild, and coming to a negative conclusion about Your Favorite Language after looking at the programs undergraduates write in it in the first week of class. The pragmatics of mechanized proving and program verification have been under serious study for much less time than the pragmatics of programming have been. The computer theorem proving community is still developing the key insights that correspond to those that functional programming texts and instructors impart to their students, to help those students get over that critical hump where using the language stops being more trouble than it is worth. Most of the insights for Coq are barely even disseminated among the experts, let alone set down in a tutorial form. I hope to use this book to go a long way towards remedying that. 156 I will go out on a limb and guess that the reader is a dedicated fan of some functional programming language or another, and that he may even have been involved in teaching that language to undergraduates. I want to propose an analogy between two attitudes: coming to a negative conclusion about Coq after reading common Coq developments in the wild, and coming to a negative conclusion about Your Favorite Language after looking at the programs undergraduates write in it in the first week of class. The pragmatics of mechanized proving and program verification have been under serious study for much less time than the pragmatics of programming have been. The computer theorem proving community is still developing the key insights that correspond to those that functional programming texts and instructors impart to their students, to help those students get over that critical hump where using the language stops being more trouble than it is worth. Most of the insights for Coq are barely even disseminated among the experts, let alone set down in a tutorial form. I hope to use this book to go a long way towards remedying that.
157 157
158 If I do that job well, then this book should be of interest even to people who have participated in classes or tutorials specifically about Coq. The book should even be useful to people who have been using Coq for years but who are mystified when their Coq developments prove impenetrable by colleagues. The crucial angle in this book is that there are "design patterns" for reliably avoiding the really grungy parts of theorem proving, and consistent use of these patterns can get you over the hump to the point where it is worth your while to use Coq to prove your theorems and certify your programs, even if formal verification is not your main concern in a project. We will follow this theme by pursuing two main methods for replacing manual proofs with more understandable artifacts: dependently-typed functions and custom Ltac decision procedures. 158 If I do that job well, then this book should be of interest even to people who have participated in classes or tutorials specifically about Coq. The book should even be useful to people who have been using Coq for years but who are mystified when their Coq developments prove impenetrable by colleagues. The crucial angle in this book is that there are %``%#"#design patterns#"#%''% for reliably avoiding the really grungy parts of theorem proving, and consistent use of these patterns can get you over the hump to the point where it is worth your while to use Coq to prove your theorems and certify your programs, even if formal verification is not your main concern in a project. We will follow this theme by pursuing two main methods for replacing manual proofs with more understandable artifacts: dependently-typed functions and custom Ltac decision procedures.
159 *) 159 *)
160 160
161 161
162 (** * Prerequisites *) 162 (** * Prerequisites *)
163 163