adamc@235
|
1 (* Copyright (c) 2009, Adam Chlipala
|
adamc@235
|
2 *
|
adamc@235
|
3 * This work is licensed under a
|
adamc@235
|
4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
|
adamc@235
|
5 * Unported License.
|
adamc@235
|
6 * The license text is available at:
|
adamc@235
|
7 * http://creativecommons.org/licenses/by-nc-nd/3.0/
|
adamc@235
|
8 *)
|
adamc@235
|
9
|
adamc@235
|
10 (* begin hide *)
|
adamc@236
|
11 Require Import Arith.
|
adamc@236
|
12
|
adamc@235
|
13 Require Import Tactics.
|
adamc@235
|
14
|
adamc@235
|
15 Set Implicit Arguments.
|
adamc@235
|
16 (* end hide *)
|
adamc@235
|
17
|
adamc@235
|
18
|
adamc@235
|
19 (** %\chapter{Proving in the Large}% *)
|
adamc@235
|
20
|
adamc@236
|
21 (** It is somewhat unfortunate that the term "theorem-proving" looks so much like the word "theory." Most researchers and practitioners in software assume that mechanized theorem-proving is profoundly impractical. Indeed, until recently, most advances in theorem-proving for higher-order logics have been largely theoretical. However, starting around the beginning of the 21st century, there was a surge in the use of proof assistants in serious verification efforts. That line of work is still quite new, but I believe it is not too soon to distill some lessons on how to work effectively with large formal proofs.
|
adamc@236
|
22
|
adamc@236
|
23 Thus, this chapter gives some tips for structuring and maintaining large Coq developments. *)
|
adamc@236
|
24
|
adamc@236
|
25
|
adamc@236
|
26 (** * Ltac Anti-Patterns *)
|
adamc@236
|
27
|
adamc@237
|
28 (** In this book, I have been following an unusual style, where proofs are not considered finished until they are "fully automated," in a certain sense. SEach such theorem is proved by a single tactic. Since Ltac is a Turing-complete programming language, it is not hard to squeeze arbitrary heuristics into single tactics, using operators like the semicolon to combine steps. In contrast, most Ltac proofs "in the wild" consist of many steps, performed by individual tactics followed by periods. Is it really worth drawing a distinction between proof steps terminated by semicolons and steps terminated by periods?
|
adamc@236
|
29
|
adamc@237
|
30 I argue that this is, in fact, a very important distinction, with serious consequences for a majority of important verification domains. The more uninteresting drudge work a proof domain involves, the more important it is to work to prove theorems with single tactics. From an automation standpoint, single-tactic proofs can be extremely effective, and automation becomes more and more critical as proofs are populated by more uninteresting detail. In this section, I will give some examples of the consequences of more common proof styles.
|
adamc@236
|
31
|
adamc@236
|
32 As a running example, consider a basic language of arithmetic expressions, an interpreter for it, and a transformation that scales up every constant in an expression. *)
|
adamc@236
|
33
|
adamc@236
|
34 Inductive exp : Set :=
|
adamc@236
|
35 | Const : nat -> exp
|
adamc@236
|
36 | Plus : exp -> exp -> exp.
|
adamc@236
|
37
|
adamc@236
|
38 Fixpoint eval (e : exp) : nat :=
|
adamc@236
|
39 match e with
|
adamc@236
|
40 | Const n => n
|
adamc@236
|
41 | Plus e1 e2 => eval e1 + eval e2
|
adamc@236
|
42 end.
|
adamc@236
|
43
|
adamc@236
|
44 Fixpoint times (k : nat) (e : exp) : exp :=
|
adamc@236
|
45 match e with
|
adamc@236
|
46 | Const n => Const (k * n)
|
adamc@236
|
47 | Plus e1 e2 => Plus (times k e1) (times k e2)
|
adamc@236
|
48 end.
|
adamc@236
|
49
|
adamc@236
|
50 (** We can write a very manual proof that [double] really doubles an expression's value. *)
|
adamc@236
|
51
|
adamc@236
|
52 Theorem eval_times : forall k e,
|
adamc@236
|
53 eval (times k e) = k * eval e.
|
adamc@236
|
54 induction e.
|
adamc@236
|
55
|
adamc@236
|
56 trivial.
|
adamc@236
|
57
|
adamc@236
|
58 simpl.
|
adamc@236
|
59 rewrite IHe1.
|
adamc@236
|
60 rewrite IHe2.
|
adamc@236
|
61 rewrite mult_plus_distr_l.
|
adamc@236
|
62 trivial.
|
adamc@236
|
63 Qed.
|
adamc@236
|
64
|
adamc@236
|
65 (** We use spaces to separate the two inductive cases. The second case mentions automatically-generated hypothesis names explicitly. As a result, innocuous changes to the theorem statement can invalidate the proof. *)
|
adamc@236
|
66
|
adamc@236
|
67 Reset eval_times.
|
adamc@236
|
68
|
adamc@236
|
69 Theorem eval_double : forall k x,
|
adamc@236
|
70 eval (times k x) = k * eval x.
|
adamc@236
|
71 induction x.
|
adamc@236
|
72
|
adamc@236
|
73 trivial.
|
adamc@236
|
74
|
adamc@236
|
75 simpl.
|
adamc@236
|
76 (** [[
|
adamc@236
|
77 rewrite IHe1.
|
adamc@236
|
78
|
adamc@236
|
79 Error: The reference IHe1 was not found in the current environment.
|
adamc@236
|
80
|
adamc@236
|
81 ]]
|
adamc@236
|
82
|
adamc@236
|
83 The inductive hypotheses are named [IHx1] and [IHx2] now, not [IHe1] and [IHe2]. *)
|
adamc@236
|
84
|
adamc@236
|
85 Abort.
|
adamc@236
|
86
|
adamc@236
|
87 (** We might decide to use a more explicit invocation of [induction] to give explicit binders for all of the names that we will reference later in the proof. *)
|
adamc@236
|
88
|
adamc@236
|
89 Theorem eval_times : forall k e,
|
adamc@236
|
90 eval (times k e) = k * eval e.
|
adamc@236
|
91 induction e as [ | ? IHe1 ? IHe2 ].
|
adamc@236
|
92
|
adamc@236
|
93 trivial.
|
adamc@236
|
94
|
adamc@236
|
95 simpl.
|
adamc@236
|
96 rewrite IHe1.
|
adamc@236
|
97 rewrite IHe2.
|
adamc@236
|
98 rewrite mult_plus_distr_l.
|
adamc@236
|
99 trivial.
|
adamc@236
|
100 Qed.
|
adamc@236
|
101
|
adamc@236
|
102 (** We pass [induction] an %\textit{%#<i>#intro pattern#</i>#%}%, using a [|] character to separate out instructions for the different inductive cases. Within a case, we write [?] to ask Coq to generate a name automatically, and we write an explicit name to assign that name to the corresponding new variable. It is apparent that, to use intro patterns to avoid proof brittleness, one needs to keep track of the seemingly unimportant facts of the orders in which variables are introduced. Thus, the script keeps working if we replace [e] by [x], but it has become more cluttered. Arguably, neither proof is particularly easy to follow.
|
adamc@236
|
103
|
adamc@237
|
104 That category of complaint has to do with understanding proofs as static artifacts. As with programming in general, with serious projects, it tends to be much more important to be able to support evolution of proofs as specifications change. Unstructured proofs like the above examples can be very hard to update in concert with theorem statements. For instance, consider how the last proof script plays out when we modify [times] to introduce a bug. *)
|
adamc@236
|
105
|
adamc@236
|
106 Reset times.
|
adamc@236
|
107
|
adamc@236
|
108 Fixpoint times (k : nat) (e : exp) : exp :=
|
adamc@236
|
109 match e with
|
adamc@236
|
110 | Const n => Const (1 + k * n)
|
adamc@236
|
111 | Plus e1 e2 => Plus (times k e1) (times k e2)
|
adamc@236
|
112 end.
|
adamc@236
|
113
|
adamc@236
|
114 Theorem eval_times : forall k e,
|
adamc@236
|
115 eval (times k e) = k * eval e.
|
adamc@236
|
116 induction e as [ | ? IHe1 ? IHe2 ].
|
adamc@236
|
117
|
adamc@236
|
118 trivial.
|
adamc@236
|
119
|
adamc@236
|
120 simpl.
|
adamc@236
|
121 (** [[
|
adamc@236
|
122 rewrite IHe1.
|
adamc@236
|
123
|
adamc@236
|
124 Error: The reference IHe1 was not found in the current environment.
|
adamc@236
|
125
|
adamc@236
|
126 ]] *)
|
adamc@236
|
127
|
adamc@236
|
128 Abort.
|
adamc@236
|
129
|
adamc@237
|
130 (** Can you spot what went wrong, without stepping through the script step-by-step? The problem is that [trivial] never fails. Originally, [trivial] had been succeeding in proving an equality that follows by reflexivity. Our change to [times] leads to a case where that equality is no longer true. [trivial] happily leaves the false equality in place, and we continue on to the span of tactics intended for the second inductive case. Unfortunately, those tactics end up being applied to the %\textit{%#<i>#first#</i>#%}% case instead.
|
adamc@237
|
131
|
adamc@237
|
132 The problem with [trivial] could be "solved" by writing [solve [trivial]] instead, so that an error is signaled early on if something unexpected happens. However, the root problem is that the syntax of a tactic invocation does not imply how many subgoals it produces. Much more confusing instances of this problem are possible. For example, if a lemma [L] is modified to take an extra hypothesis, then uses of [apply L] will general more subgoals than before. Old unstructured proof scripts will become hopelessly jumbled, with tactics applied to inappropriate subgoals. Because of the lack of structure, there is usually relatively little to be gleaned from knowledge of the precise point in a proof script where an error is raised. *)
|
adamc@236
|
133
|
adamc@236
|
134 Reset times.
|
adamc@236
|
135
|
adamc@236
|
136 Fixpoint times (k : nat) (e : exp) : exp :=
|
adamc@236
|
137 match e with
|
adamc@236
|
138 | Const n => Const (k * n)
|
adamc@236
|
139 | Plus e1 e2 => Plus (times k e1) (times k e2)
|
adamc@236
|
140 end.
|
adamc@236
|
141
|
adamc@237
|
142 (** Many real developments try to make essentially unstructured proofs look structured by applying careful indentation conventions, idempotent case-marker tactics included soley to serve as documentation, and so on. All of these strategies suffer from the same kind of failure of abstraction that was just demonstrated. I like to say that if you find yourself caring about indentation in a proof script, it is a sign that the script is structured poorly.
|
adamc@236
|
143
|
adamc@236
|
144 We can rewrite the current proof with a single tactic. *)
|
adamc@236
|
145
|
adamc@236
|
146 Theorem eval_times : forall k e,
|
adamc@236
|
147 eval (times k e) = k * eval e.
|
adamc@236
|
148 induction e as [ | ? IHe1 ? IHe2 ]; [
|
adamc@236
|
149 trivial
|
adamc@236
|
150 | simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial ].
|
adamc@236
|
151 Qed.
|
adamc@236
|
152
|
adamc@236
|
153 (** This is an improvement in robustness of the script. We no longer need to worry about tactics from one case being applied to a different case. Still, the proof script is not especially readable. Probably most readers would not find it helpful in explaining why the theorem is true.
|
adamc@236
|
154
|
adamc@236
|
155 The situation gets worse in considering extensions to the theorem we want to prove. Let us add multiplication nodes to our [exp] type and see how the proof fares. *)
|
adamc@236
|
156
|
adamc@236
|
157 Reset exp.
|
adamc@236
|
158
|
adamc@236
|
159 Inductive exp : Set :=
|
adamc@236
|
160 | Const : nat -> exp
|
adamc@236
|
161 | Plus : exp -> exp -> exp
|
adamc@236
|
162 | Mult : exp -> exp -> exp.
|
adamc@236
|
163
|
adamc@236
|
164 Fixpoint eval (e : exp) : nat :=
|
adamc@236
|
165 match e with
|
adamc@236
|
166 | Const n => n
|
adamc@236
|
167 | Plus e1 e2 => eval e1 + eval e2
|
adamc@236
|
168 | Mult e1 e2 => eval e1 * eval e2
|
adamc@236
|
169 end.
|
adamc@236
|
170
|
adamc@236
|
171 Fixpoint times (k : nat) (e : exp) : exp :=
|
adamc@236
|
172 match e with
|
adamc@236
|
173 | Const n => Const (k * n)
|
adamc@236
|
174 | Plus e1 e2 => Plus (times k e1) (times k e2)
|
adamc@236
|
175 | Mult e1 e2 => Mult (times k e1) e2
|
adamc@236
|
176 end.
|
adamc@236
|
177
|
adamc@236
|
178 Theorem eval_times : forall k e,
|
adamc@236
|
179 eval (times k e) = k * eval e.
|
adamc@236
|
180 (** [[
|
adamc@236
|
181 induction e as [ | ? IHe1 ? IHe2 ]; [
|
adamc@236
|
182 trivial
|
adamc@236
|
183 | simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial ].
|
adamc@236
|
184
|
adamc@236
|
185 Error: Expects a disjunctive pattern with 3 branches.
|
adamc@236
|
186
|
adamc@236
|
187 ]] *)
|
adamc@236
|
188
|
adamc@236
|
189 Abort.
|
adamc@236
|
190
|
adamc@236
|
191 (** Unsurprisingly, the old proof fails, because it explicitly says that there are two inductive cases. To update the script, we must, at a minimum, remember the order in which the inductive cases are generated, so that we can insert the new case in the appropriate place. Even then, it will be painful to add the case, because we cannot walk through proof steps interactively when they occur inside an explicit set of cases. *)
|
adamc@236
|
192
|
adamc@236
|
193 Theorem eval_times : forall k e,
|
adamc@236
|
194 eval (times k e) = k * eval e.
|
adamc@236
|
195 induction e as [ | ? IHe1 ? IHe2 | ? IHe1 ? IHe2 ]; [
|
adamc@236
|
196 trivial
|
adamc@236
|
197 | simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial
|
adamc@236
|
198 | simpl; rewrite IHe1; rewrite mult_assoc; trivial ].
|
adamc@236
|
199 Qed.
|
adamc@236
|
200
|
adamc@236
|
201 (** Now we are in a position to see how much nicer is the style of proof that we have followed in most of this book. *)
|
adamc@236
|
202
|
adamc@236
|
203 Reset eval_times.
|
adamc@236
|
204
|
adamc@238
|
205 Hint Rewrite mult_plus_distr_l : cpdt.
|
adamc@238
|
206
|
adamc@236
|
207 Theorem eval_times : forall k e,
|
adamc@236
|
208 eval (times k e) = k * eval e.
|
adamc@236
|
209 induction e; crush.
|
adamc@236
|
210 Qed.
|
adamc@236
|
211
|
adamc@237
|
212 (** This style is motivated by a hard truth: one person's manual proof script is almost always mostly inscrutable to most everyone else. I claim that step-by-step formal proofs are a poor way of conveying information. Thus, we had might as well cut out the steps and automate as much as possible.
|
adamc@237
|
213
|
adamc@237
|
214 What about the illustrative value of proofs? Most informal proofs are read to convey the big ideas of proofs. How can reading [induction e; crush] convey any big ideas? My position is that any ideas that standard automation can find are not very big after all, and the %\textit{%#<i>#real#</i>#%}% big ideas should be expressed through lemmas that are added as hints.
|
adamc@237
|
215
|
adamc@237
|
216 An example should help illustrate what I mean. Consider this function, which rewrites an expression using associativity of addition and multiplication. *)
|
adamc@237
|
217
|
adamc@237
|
218 Fixpoint reassoc (e : exp) : exp :=
|
adamc@237
|
219 match e with
|
adamc@237
|
220 | Const _ => e
|
adamc@237
|
221 | Plus e1 e2 =>
|
adamc@237
|
222 let e1' := reassoc e1 in
|
adamc@237
|
223 let e2' := reassoc e2 in
|
adamc@237
|
224 match e2' with
|
adamc@237
|
225 | Plus e21 e22 => Plus (Plus e1' e21) e22
|
adamc@237
|
226 | _ => Plus e1' e2'
|
adamc@237
|
227 end
|
adamc@237
|
228 | Mult e1 e2 =>
|
adamc@237
|
229 let e1' := reassoc e1 in
|
adamc@237
|
230 let e2' := reassoc e2 in
|
adamc@237
|
231 match e2' with
|
adamc@237
|
232 | Mult e21 e22 => Mult (Mult e1' e21) e22
|
adamc@237
|
233 | _ => Mult e1' e2'
|
adamc@237
|
234 end
|
adamc@237
|
235 end.
|
adamc@237
|
236
|
adamc@237
|
237 Theorem reassoc_correct : forall e, eval (reassoc e) = eval e.
|
adamc@237
|
238 induction e; crush;
|
adamc@237
|
239 match goal with
|
adamc@237
|
240 | [ |- context[match ?E with Const _ => _ | Plus _ _ => _ | Mult _ _ => _ end] ] =>
|
adamc@237
|
241 destruct E; crush
|
adamc@237
|
242 end.
|
adamc@237
|
243
|
adamc@237
|
244 (** One subgoal remains:
|
adamc@237
|
245 [[
|
adamc@237
|
246 IHe2 : eval e3 * eval e4 = eval e2
|
adamc@237
|
247 ============================
|
adamc@237
|
248 eval e1 * eval e3 * eval e4 = eval e1 * eval e2
|
adamc@237
|
249
|
adamc@237
|
250 ]]
|
adamc@237
|
251
|
adamc@237
|
252 [crush] does not know how to finish this goal. We could finish the proof manually. *)
|
adamc@237
|
253
|
adamc@237
|
254 rewrite <- IHe2; crush.
|
adamc@237
|
255
|
adamc@237
|
256 (** However, the proof would be easier to understand and maintain if we separated this insight into a separate lemma. *)
|
adamc@237
|
257
|
adamc@237
|
258 Abort.
|
adamc@237
|
259
|
adamc@237
|
260 Lemma rewr : forall a b c d, b * c = d -> a * b * c = a * d.
|
adamc@237
|
261 crush.
|
adamc@237
|
262 Qed.
|
adamc@237
|
263
|
adamc@237
|
264 Hint Resolve rewr.
|
adamc@237
|
265
|
adamc@237
|
266 Theorem reassoc_correct : forall e, eval (reassoc e) = eval e.
|
adamc@237
|
267 induction e; crush;
|
adamc@237
|
268 match goal with
|
adamc@237
|
269 | [ |- context[match ?E with Const _ => _ | Plus _ _ => _ | Mult _ _ => _ end] ] =>
|
adamc@237
|
270 destruct E; crush
|
adamc@237
|
271 end.
|
adamc@237
|
272 Qed.
|
adamc@237
|
273
|
adamc@237
|
274 (** In the limit, a complicated inductive proof might rely on one hint for each inductive case. The lemma for each hint could restate the associated case. Compared to manual proof scripts, we arrive at more readable results. Scripts no longer need to depend on the order in which cases are generated. The lemmas are easier to digest separately than are fragments of tactic code, since lemma statements include complete proof contexts. Such contexts can only be extracted from monolithic manual proofs by stepping through scripts interactively.
|
adamc@237
|
275
|
adamc@237
|
276 The more common situation is that a large induction has several easy cases that automation makes short work of. In the remaining cases, automation performs some standard simplification. Among these cases, some may require quite involved proofs; such a case may deserve a hint lemma of its own, where the lemma statement may copy the simplified version of the case. Alternatively, the proof script for the main theorem may be extended with some automation code targeted at the specific case. Even such targeted scripting is more desirable than manual proving, because it may be read and understood without knowledge of a proof's hierarchical structure, case ordering, or name binding structure. *)
|
adamc@237
|
277
|
adamc@235
|
278
|
adamc@238
|
279 (** * Debugging and Maintaining Automation *)
|
adamc@238
|
280
|
adamc@238
|
281 (** Fully-automated proofs are desirable because they open up possibilities for automatic adaptation to changes of specification. A well-engineered script within a narrow domain can survive many changes to the formulation of the problem it solves. Still, as we are working with higher-order logic, most theorems fall within no obvious decidable theories. It is inevitable that most long-lived automated proofs will need updating.
|
adamc@238
|
282
|
adamc@238
|
283 Before we are ready to update our proofs, we need to write them in the first place. While fully-automated scripts are most robust to changes of specification, it is hard to write every new proof directly in that form. Instead, it is useful to begin a theorem with exploratory proving and then gradually refine it into a suitable automated form.
|
adamc@238
|
284
|
adamc@238
|
285 Consider this theorem from Chapter 7, which we begin by proving in a mostly manual way, invoking [crush] after each steop to discharge any low-hanging fruit. Our manual effort involves choosing which expressions to case-analyze on. *)
|
adamc@238
|
286
|
adamc@238
|
287 (* begin hide *)
|
adamc@238
|
288 Require Import MoreDep.
|
adamc@238
|
289 (* end hide *)
|
adamc@238
|
290
|
adamc@238
|
291 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
|
adamc@238
|
292 induction e; crush.
|
adamc@238
|
293
|
adamc@238
|
294 dep_destruct (cfold e1); crush.
|
adamc@238
|
295 dep_destruct (cfold e2); crush.
|
adamc@238
|
296
|
adamc@238
|
297 dep_destruct (cfold e1); crush.
|
adamc@238
|
298 dep_destruct (cfold e2); crush.
|
adamc@238
|
299
|
adamc@238
|
300 dep_destruct (cfold e1); crush.
|
adamc@238
|
301 dep_destruct (cfold e2); crush.
|
adamc@238
|
302
|
adamc@238
|
303 dep_destruct (cfold e1); crush.
|
adamc@238
|
304 dep_destruct (expDenote e1); crush.
|
adamc@238
|
305
|
adamc@238
|
306 dep_destruct (cfold e); crush.
|
adamc@238
|
307
|
adamc@238
|
308 dep_destruct (cfold e); crush.
|
adamc@238
|
309 Qed.
|
adamc@238
|
310
|
adamc@238
|
311 (** In this complete proof, it is hard to avoid noticing a pattern. We rework the proof, abstracting over the patterns we find. *)
|
adamc@238
|
312
|
adamc@238
|
313 Reset cfold_correct.
|
adamc@238
|
314
|
adamc@238
|
315 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
|
adamc@238
|
316 induction e; crush.
|
adamc@238
|
317
|
adamc@238
|
318 (** The expression we want to destruct here turns out to be the discriminee of a [match], and we can easily enough write a tactic that destructs all such expressions. *)
|
adamc@238
|
319
|
adamc@238
|
320 Ltac t :=
|
adamc@238
|
321 repeat (match goal with
|
adamc@238
|
322 | [ |- context[match ?E with NConst _ => _ | Plus _ _ => _
|
adamc@238
|
323 | Eq _ _ => _ | BConst _ => _ | And _ _ => _
|
adamc@238
|
324 | If _ _ _ _ => _ | Pair _ _ _ _ => _
|
adamc@238
|
325 | Fst _ _ _ => _ | Snd _ _ _ => _ end] ] =>
|
adamc@238
|
326 dep_destruct E
|
adamc@238
|
327 end; crush).
|
adamc@238
|
328
|
adamc@238
|
329 t.
|
adamc@238
|
330
|
adamc@238
|
331 (** This tactic invocation discharges the whole case. It does the same on the next two cases, but it gets stuck on the fourth case. *)
|
adamc@238
|
332
|
adamc@238
|
333 t.
|
adamc@238
|
334
|
adamc@238
|
335 t.
|
adamc@238
|
336
|
adamc@238
|
337 t.
|
adamc@238
|
338
|
adamc@238
|
339 (** The subgoal's conclusion is:
|
adamc@238
|
340 [[
|
adamc@238
|
341 ============================
|
adamc@238
|
342 (if expDenote e1 then expDenote (cfold e2) else expDenote (cfold e3)) =
|
adamc@238
|
343 expDenote (if expDenote e1 then cfold e2 else cfold e3)
|
adamc@238
|
344
|
adamc@238
|
345 ]]
|
adamc@238
|
346
|
adamc@238
|
347 We need to expand our [t] tactic to handle this case. *)
|
adamc@238
|
348
|
adamc@238
|
349 Ltac t' :=
|
adamc@238
|
350 repeat (match goal with
|
adamc@238
|
351 | [ |- context[match ?E with NConst _ => _ | Plus _ _ => _
|
adamc@238
|
352 | Eq _ _ => _ | BConst _ => _ | And _ _ => _
|
adamc@238
|
353 | If _ _ _ _ => _ | Pair _ _ _ _ => _
|
adamc@238
|
354 | Fst _ _ _ => _ | Snd _ _ _ => _ end] ] =>
|
adamc@238
|
355 dep_destruct E
|
adamc@238
|
356 | [ |- (if ?E then _ else _) = _ ] => destruct E
|
adamc@238
|
357 end; crush).
|
adamc@238
|
358
|
adamc@238
|
359 t'.
|
adamc@238
|
360
|
adamc@238
|
361 (** Now the goal is discharged, but [t'] has no effect on the next subgoal. *)
|
adamc@238
|
362
|
adamc@238
|
363 t'.
|
adamc@238
|
364
|
adamc@238
|
365 (** A final revision of [t] finishes the proof. *)
|
adamc@238
|
366
|
adamc@238
|
367 Ltac t'' :=
|
adamc@238
|
368 repeat (match goal with
|
adamc@238
|
369 | [ |- context[match ?E with NConst _ => _ | Plus _ _ => _
|
adamc@238
|
370 | Eq _ _ => _ | BConst _ => _ | And _ _ => _
|
adamc@238
|
371 | If _ _ _ _ => _ | Pair _ _ _ _ => _
|
adamc@238
|
372 | Fst _ _ _ => _ | Snd _ _ _ => _ end] ] =>
|
adamc@238
|
373 dep_destruct E
|
adamc@238
|
374 | [ |- (if ?E then _ else _) = _ ] => destruct E
|
adamc@238
|
375 | [ |- context[match pairOut ?E with Some _ => _
|
adamc@238
|
376 | None => _ end] ] =>
|
adamc@238
|
377 dep_destruct E
|
adamc@238
|
378 end; crush).
|
adamc@238
|
379
|
adamc@238
|
380 t''.
|
adamc@238
|
381
|
adamc@238
|
382 t''.
|
adamc@238
|
383 Qed.
|
adamc@238
|
384
|
adamc@238
|
385 (** We can take the final tactic and move it into the initial part of the proof script, arriving at a nicely-automated proof. *)
|
adamc@238
|
386
|
adamc@238
|
387 Reset t.
|
adamc@238
|
388
|
adamc@238
|
389 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
|
adamc@238
|
390 induction e; crush;
|
adamc@238
|
391 repeat (match goal with
|
adamc@238
|
392 | [ |- context[match ?E with NConst _ => _ | Plus _ _ => _
|
adamc@238
|
393 | Eq _ _ => _ | BConst _ => _ | And _ _ => _
|
adamc@238
|
394 | If _ _ _ _ => _ | Pair _ _ _ _ => _
|
adamc@238
|
395 | Fst _ _ _ => _ | Snd _ _ _ => _ end] ] =>
|
adamc@238
|
396 dep_destruct E
|
adamc@238
|
397 | [ |- (if ?E then _ else _) = _ ] => destruct E
|
adamc@238
|
398 | [ |- context[match pairOut ?E with Some _ => _
|
adamc@238
|
399 | None => _ end] ] =>
|
adamc@238
|
400 dep_destruct E
|
adamc@238
|
401 end; crush).
|
adamc@238
|
402 Qed.
|
adamc@238
|
403
|
adamc@240
|
404 (** Even after we put together nice automated proofs, we must deal with specification changes that can invalidate them. It is not generally possible to step through single-tactic proofs interactively. There is a command [Debug On] that lets us step through points in tactic execution, but the debugger tends to make counterintuitive choices of which points we would like to stop at, and per-point output is quite verbose, so most Coq users do not find this debugging mode very helpful. How are we to understand what has broken in a script that used to work?
|
adamc@240
|
405
|
adamc@240
|
406 An example helps demonstrate a useful approach. Consider what would have happened in our proof of [reassoc_correct] if we had first added an unfortunate rewriting hint. *)
|
adamc@240
|
407
|
adamc@240
|
408 Reset reassoc_correct.
|
adamc@240
|
409
|
adamc@240
|
410 Theorem confounder : forall e1 e2 e3,
|
adamc@240
|
411 eval e1 * eval e2 * eval e3 = eval e1 * (eval e2 + 1 - 1) * eval e3.
|
adamc@240
|
412 crush.
|
adamc@240
|
413 Qed.
|
adamc@240
|
414
|
adamc@240
|
415 Hint Rewrite confounder : cpdt.
|
adamc@240
|
416
|
adamc@240
|
417 Theorem reassoc_correct : forall e, eval (reassoc e) = eval e.
|
adamc@240
|
418 induction e; crush;
|
adamc@240
|
419 match goal with
|
adamc@240
|
420 | [ |- context[match ?E with Const _ => _ | Plus _ _ => _ | Mult _ _ => _ end] ] =>
|
adamc@240
|
421 destruct E; crush
|
adamc@240
|
422 end.
|
adamc@240
|
423
|
adamc@240
|
424 (** One subgoal remains:
|
adamc@240
|
425
|
adamc@240
|
426 [[
|
adamc@240
|
427 ============================
|
adamc@240
|
428 eval e1 * (eval e3 + 1 - 1) * eval e4 = eval e1 * eval e2
|
adamc@240
|
429
|
adamc@240
|
430 ]]
|
adamc@240
|
431
|
adamc@240
|
432 The poorly-chosen rewrite rule fired, changing the goal to a form where another hint no longer applies. Imagine that we are in the middle of a large development with many hints. How would we diagnose the problem? First, we might not be sure which case of the inductive proof has gone wrong. It is useful to separate out our automation procedure and apply it manually. *)
|
adamc@240
|
433
|
adamc@240
|
434 Restart.
|
adamc@240
|
435
|
adamc@240
|
436 Ltac t := crush; match goal with
|
adamc@240
|
437 | [ |- context[match ?E with Const _ => _ | Plus _ _ => _
|
adamc@240
|
438 | Mult _ _ => _ end] ] =>
|
adamc@240
|
439 destruct E; crush
|
adamc@240
|
440 end.
|
adamc@240
|
441
|
adamc@240
|
442 induction e.
|
adamc@240
|
443
|
adamc@240
|
444 (** Since we see the subgoals before any simplification occurs, it is clear that this is the case for constants. [t] makes short work of it. *)
|
adamc@240
|
445
|
adamc@240
|
446 t.
|
adamc@240
|
447
|
adamc@240
|
448 (** The next subgoal, for addition, is also discharged without trouble. *)
|
adamc@240
|
449
|
adamc@240
|
450 t.
|
adamc@240
|
451
|
adamc@240
|
452 (** The final subgoal is for multiplication, and it is here that we get stuck in the proof state summarized above. *)
|
adamc@240
|
453
|
adamc@240
|
454 t.
|
adamc@240
|
455
|
adamc@240
|
456 (** What is [t] doing to get us to this point? The [info] command can help us answer this kind of question. *)
|
adamc@240
|
457
|
adamc@240
|
458 (** remove printing * *)
|
adamc@240
|
459 Undo.
|
adamc@240
|
460 info t.
|
adamc@240
|
461 (** [[
|
adamc@240
|
462 == simpl in *; intuition; subst; autorewrite with cpdt in *;
|
adamc@240
|
463 simpl in *; intuition; subst; autorewrite with cpdt in *;
|
adamc@240
|
464 simpl in *; intuition; subst; destruct (reassoc e2).
|
adamc@240
|
465 simpl in *; intuition.
|
adamc@240
|
466
|
adamc@240
|
467 simpl in *; intuition.
|
adamc@240
|
468
|
adamc@240
|
469 simpl in *; intuition; subst; autorewrite with cpdt in *;
|
adamc@240
|
470 refine (eq_ind_r
|
adamc@240
|
471 (fun n : nat =>
|
adamc@240
|
472 n * (eval e3 + 1 - 1) * eval e4 = eval e1 * eval e2) _ IHe1);
|
adamc@240
|
473 autorewrite with cpdt in *; simpl in *; intuition;
|
adamc@240
|
474 subst; autorewrite with cpdt in *; simpl in *;
|
adamc@240
|
475 intuition; subst.
|
adamc@240
|
476
|
adamc@240
|
477 ]]
|
adamc@240
|
478
|
adamc@240
|
479 A detailed trace of [t]'s execution appears. Since we are using the very general [crush] tactic, many of these steps have no effect and only occur as instances of a more general strategy. We can copy-and-paste the details to see where things go wrong. *)
|
adamc@240
|
480
|
adamc@240
|
481 Undo.
|
adamc@240
|
482
|
adamc@240
|
483 (** We arbitrarily split the script into chunks. The first few seem not to do any harm. *)
|
adamc@240
|
484
|
adamc@240
|
485 simpl in *; intuition; subst; autorewrite with cpdt in *.
|
adamc@240
|
486 simpl in *; intuition; subst; autorewrite with cpdt in *.
|
adamc@240
|
487 simpl in *; intuition; subst; destruct (reassoc e2).
|
adamc@240
|
488 simpl in *; intuition.
|
adamc@240
|
489 simpl in *; intuition.
|
adamc@240
|
490
|
adamc@240
|
491 (** The next step is revealed as the culprit, bringing us to the final unproved subgoal. *)
|
adamc@240
|
492
|
adamc@240
|
493 simpl in *; intuition; subst; autorewrite with cpdt in *.
|
adamc@240
|
494
|
adamc@240
|
495 (** We can split the steps further to assign blame. *)
|
adamc@240
|
496
|
adamc@240
|
497 Undo.
|
adamc@240
|
498
|
adamc@240
|
499 simpl in *.
|
adamc@240
|
500 intuition.
|
adamc@240
|
501 subst.
|
adamc@240
|
502 autorewrite with cpdt in *.
|
adamc@240
|
503
|
adamc@240
|
504 (** It was the final of these four tactics that made the rewrite. We can find out exactly what happened. The [info] command presents hierarchical views of proof steps, and we can zoom down to a lower level of detail by applying [info] to one of the steps that appeared in the original trace. *)
|
adamc@240
|
505
|
adamc@240
|
506 Undo.
|
adamc@240
|
507
|
adamc@240
|
508 info autorewrite with cpdt in *.
|
adamc@240
|
509 (** [[
|
adamc@240
|
510 == refine (eq_ind_r (fun n : nat => n = eval e1 * eval e2) _
|
adamc@240
|
511 (confounder (reassoc e1) e3 e4)).
|
adamc@240
|
512
|
adamc@240
|
513 ]]
|
adamc@240
|
514
|
adamc@240
|
515 The way a rewrite is displayed is somewhat baroque, but we can see that theorem [confounder] is the final culprit. At this point, we could remove that hint, prove an alternate version of the key lemma [rewr], or come up with some other remedy. Fixing this kind of problem tends to be relatively easy once the problem is revealed. *)
|
adamc@240
|
516
|
adamc@240
|
517 Abort.
|
adamc@240
|
518
|
adamc@240
|
519 (** printing * $\times$ *)
|
adamc@240
|
520
|
adamc@239
|
521 Section slow.
|
adamc@239
|
522 Hint Resolve trans_eq.
|
adamc@239
|
523
|
adamc@239
|
524 Variable A : Set.
|
adamc@239
|
525 Variables P Q R S : A -> A -> Prop.
|
adamc@239
|
526 Variable f : A -> A.
|
adamc@239
|
527
|
adamc@239
|
528 Hypothesis H1 : forall x y, P x y -> Q x y -> R x y -> f x = f y.
|
adamc@239
|
529 Hypothesis H2 : forall x y, S x y -> R x y.
|
adamc@239
|
530
|
adamc@239
|
531 Lemma slow : forall x y, P x y -> Q x y -> S x y -> f x = f y.
|
adamc@239
|
532 debug eauto.
|
adamc@239
|
533 Qed.
|
adamc@239
|
534
|
adamc@239
|
535 Hypothesis H3 : forall x y, x = y -> f x = f y.
|
adamc@239
|
536
|
adamc@239
|
537 Lemma slow' : forall x y, P x y -> Q x y -> S x y -> f x = f y.
|
adamc@239
|
538 debug eauto.
|
adamc@239
|
539 Qed.
|
adamc@239
|
540 End slow.
|
adamc@239
|
541
|
adamc@238
|
542
|
adamc@235
|
543 (** * Modules *)
|
adamc@235
|
544
|
adamc@235
|
545 Module Type GROUP.
|
adamc@235
|
546 Parameter G : Set.
|
adamc@235
|
547 Parameter f : G -> G -> G.
|
adamc@235
|
548 Parameter e : G.
|
adamc@235
|
549 Parameter i : G -> G.
|
adamc@235
|
550
|
adamc@235
|
551 Axiom assoc : forall a b c, f (f a b) c = f a (f b c).
|
adamc@235
|
552 Axiom ident : forall a, f e a = a.
|
adamc@235
|
553 Axiom inverse : forall a, f (i a) a = e.
|
adamc@235
|
554 End GROUP.
|
adamc@235
|
555
|
adamc@235
|
556 Module Type GROUP_THEOREMS.
|
adamc@235
|
557 Declare Module M : GROUP.
|
adamc@235
|
558
|
adamc@235
|
559 Axiom ident' : forall a, M.f a M.e = a.
|
adamc@235
|
560
|
adamc@235
|
561 Axiom inverse' : forall a, M.f a (M.i a) = M.e.
|
adamc@235
|
562
|
adamc@235
|
563 Axiom unique_ident : forall e', (forall a, M.f e' a = a) -> e' = M.e.
|
adamc@235
|
564 End GROUP_THEOREMS.
|
adamc@235
|
565
|
adamc@239
|
566 Module Group (M : GROUP) : GROUP_THEOREMS with Module M := M.
|
adamc@235
|
567 Module M := M.
|
adamc@235
|
568
|
adamc@235
|
569 Import M.
|
adamc@235
|
570
|
adamc@235
|
571 Theorem inverse' : forall a, f a (i a) = e.
|
adamc@235
|
572 intro.
|
adamc@235
|
573 rewrite <- (ident (f a (i a))).
|
adamc@235
|
574 rewrite <- (inverse (f a (i a))) at 1.
|
adamc@235
|
575 rewrite assoc.
|
adamc@235
|
576 rewrite assoc.
|
adamc@235
|
577 rewrite <- (assoc (i a) a (i a)).
|
adamc@235
|
578 rewrite inverse.
|
adamc@235
|
579 rewrite ident.
|
adamc@235
|
580 apply inverse.
|
adamc@235
|
581 Qed.
|
adamc@235
|
582
|
adamc@235
|
583 Theorem ident' : forall a, f a e = a.
|
adamc@235
|
584 intro.
|
adamc@235
|
585 rewrite <- (inverse a).
|
adamc@235
|
586 rewrite <- assoc.
|
adamc@235
|
587 rewrite inverse'.
|
adamc@235
|
588 apply ident.
|
adamc@235
|
589 Qed.
|
adamc@235
|
590
|
adamc@235
|
591 Theorem unique_ident : forall e', (forall a, M.f e' a = a) -> e' = M.e.
|
adamc@235
|
592 intros.
|
adamc@235
|
593 rewrite <- (H e).
|
adamc@235
|
594 symmetry.
|
adamc@235
|
595 apply ident'.
|
adamc@235
|
596 Qed.
|
adamc@235
|
597 End Group.
|
adamc@239
|
598
|
adamc@239
|
599 Require Import ZArith.
|
adamc@239
|
600 Open Scope Z_scope.
|
adamc@239
|
601
|
adamc@239
|
602 Module Int.
|
adamc@239
|
603 Definition G := Z.
|
adamc@239
|
604 Definition f x y := x + y.
|
adamc@239
|
605 Definition e := 0.
|
adamc@239
|
606 Definition i x := -x.
|
adamc@239
|
607
|
adamc@239
|
608 Theorem assoc : forall a b c, f (f a b) c = f a (f b c).
|
adamc@239
|
609 unfold f; crush.
|
adamc@239
|
610 Qed.
|
adamc@239
|
611 Theorem ident : forall a, f e a = a.
|
adamc@239
|
612 unfold f, e; crush.
|
adamc@239
|
613 Qed.
|
adamc@239
|
614 Theorem inverse : forall a, f (i a) a = e.
|
adamc@239
|
615 unfold f, i, e; crush.
|
adamc@239
|
616 Qed.
|
adamc@239
|
617 End Int.
|
adamc@239
|
618
|
adamc@239
|
619 Module IntTheorems := Group(Int).
|
adamc@239
|
620
|
adamc@239
|
621 Check IntTheorems.unique_ident.
|
adamc@239
|
622
|
adamc@239
|
623 Theorem unique_ident : forall e', (forall a, e' + a = a) -> e' = 0.
|
adamc@239
|
624 exact IntTheorems.unique_ident.
|
adamc@239
|
625 Qed.
|