comparison src/Large.v @ 367:b809d3a8a5b1

Pass over old Large material; index fixes
author Adam Chlipala <adam@chlipala.net>
date Tue, 08 Nov 2011 11:54:09 -0500
parents ad315efc3b6b
children e0c5b91e2968
comparison
equal deleted inserted replaced
366:03e200599633 367:b809d3a8a5b1
1 (* Copyright (c) 2009-2010, Adam Chlipala 1 (* Copyright (c) 2009-2011, Adam Chlipala
2 * 2 *
3 * This work is licensed under a 3 * This work is licensed under a
4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
5 * Unported License. 5 * Unported License.
6 * The license text is available at: 6 * The license text is available at:
16 (* end hide *) 16 (* end hide *)
17 17
18 18
19 (** %\chapter{Proving in the Large}% *) 19 (** %\chapter{Proving in the Large}% *)
20 20
21 (** It is somewhat unfortunate that the term %``%#"#theorem-proving#"#%''% looks so much like the word %``%#"#theory.#"#%''% Most researchers and practitioners in software assume that mechanized theorem-proving is profoundly impractical. Indeed, until recently, most advances in theorem-proving for higher-order logics have been largely theoretical. However, starting around the beginning of the 21st century, there was a surge in the use of proof assistants in serious verification efforts. That line of work is still quite new, but I believe it is not too soon to distill some lessons on how to work effectively with large formal proofs. 21 (** It is somewhat unfortunate that the term %``%#"#theorem proving#"#%''% looks so much like the word %``%#"#theory.#"#%''% Most researchers and practitioners in software assume that mechanized theorem proving is profoundly impractical. Indeed, until recently, most advances in theorem proving for higher-order logics have been largely theoretical. However, starting around the beginning of the 21st century, there was a surge in the use of proof assistants in serious verification efforts. That line of work is still quite new, but I believe it is not too soon to distill some lessons on how to work effectively with large formal proofs.
22 22
23 Thus, this chapter gives some tips for structuring and maintaining large Coq developments. *) 23 Thus, this chapter gives some tips for structuring and maintaining large Coq developments. *)
24 24
25 25
26 (** * Ltac Anti-Patterns *) 26 (** * Ltac Anti-Patterns *)
27 27
28 (** In this book, I have been following an unusual style, where proofs are not considered finished until they are %``%#"#fully automated,#"#%''% in a certain sense. Each such theorem is proved by a single tactic. Since Ltac is a Turing-complete programming language, it is not hard to squeeze arbitrary heuristics into single tactics, using operators like the semicolon to combine steps. In contrast, most Ltac proofs %``%#"#in the wild#"#%''% consist of many steps, performed by individual tactics followed by periods. Is it really worth drawing a distinction between proof steps terminated by semicolons and steps terminated by periods? 28 (** In this book, I have been following an unusual style, where proofs are not considered finished until they are %\index{fully automated proofs}``%#"#fully automated,#"#%''% in a certain sense. Each such theorem is proved by a single tactic. Since Ltac is a Turing-complete programming language, it is not hard to squeeze arbitrary heuristics into single tactics, using operators like the semicolon to combine steps. In contrast, most Ltac proofs %``%#"#in the wild#"#%''% consist of many steps, performed by individual tactics followed by periods. Is it really worth drawing a distinction between proof steps terminated by semicolons and steps terminated by periods?
29 29
30 I argue that this is, in fact, a very important distinction, with serious consequences for a majority of important verification domains. The more uninteresting drudge work a proof domain involves, the more important it is to work to prove theorems with single tactics. From an automation standpoint, single-tactic proofs can be extremely effective, and automation becomes more and more critical as proofs are populated by more uninteresting detail. In this section, I will give some examples of the consequences of more common proof styles. 30 I argue that this is, in fact, a very important distinction, with serious consequences for a majority of important verification domains. The more uninteresting drudge work a proof domain involves, the more important it is to work to prove theorems with single tactics. From an automation standpoint, single-tactic proofs can be extremely effective, and automation becomes more and more critical as proofs are populated by more uninteresting detail. In this section, I will give some examples of the consequences of more common proof styles.
31 31
32 As a running example, consider a basic language of arithmetic expressions, an interpreter for it, and a transformation that scales up every constant in an expression. *) 32 As a running example, consider a basic language of arithmetic expressions, an interpreter for it, and a transformation that scales up every constant in an expression. *)
33 33
60 rewrite IHe2. 60 rewrite IHe2.
61 rewrite mult_plus_distr_l. 61 rewrite mult_plus_distr_l.
62 trivial. 62 trivial.
63 Qed. 63 Qed.
64 64
65 (** We use spaces to separate the two inductive cases. The second case mentions automatically-generated hypothesis names explicitly. As a result, innocuous changes to the theorem statement can invalidate the proof. *) 65 (** We use spaces to separate the two inductive cases, but note that these spaces have no real semantic content; Coq does not enforce that our spacing matches the real case structure of a proof. The second case mentions automatically generated hypothesis names explicitly. As a result, innocuous changes to the theorem statement can invalidate the proof. *)
66 66
67 Reset eval_times. 67 Reset eval_times.
68 68
69 Theorem eval_double : forall k x, 69 Theorem eval_double : forall k x,
70 eval (times k x) = k * eval x. 70 eval (times k x) = k * eval x.
71 induction x. 71 induction x.
72 72
73 trivial. 73 trivial.
74 74
75 simpl. 75 simpl.
76 (** [[ 76 (** %\vspace{-.15in}%[[
77 rewrite IHe1. 77 rewrite IHe1.
78 78 ]]
79
80 <<
79 Error: The reference IHe1 was not found in the current environment. 81 Error: The reference IHe1 was not found in the current environment.
80 82 >>
81 ]]
82 83
83 The inductive hypotheses are named [IHx1] and [IHx2] now, not [IHe1] and [IHe2]. *) 84 The inductive hypotheses are named [IHx1] and [IHx2] now, not [IHe1] and [IHe2]. *)
84 85
85 Abort. 86 Abort.
86 87
97 rewrite IHe2. 98 rewrite IHe2.
98 rewrite mult_plus_distr_l. 99 rewrite mult_plus_distr_l.
99 trivial. 100 trivial.
100 Qed. 101 Qed.
101 102
102 (** We pass [induction] an %\textit{%#<i>#intro pattern#</i>#%}%, using a [|] character to separate out instructions for the different inductive cases. Within a case, we write [?] to ask Coq to generate a name automatically, and we write an explicit name to assign that name to the corresponding new variable. It is apparent that, to use intro patterns to avoid proof brittleness, one needs to keep track of the seemingly unimportant facts of the orders in which variables are introduced. Thus, the script keeps working if we replace [e] by [x], but it has become more cluttered. Arguably, neither proof is particularly easy to follow. 103 (** We pass %\index{tactics!induction}%[induction] an %\index{intro pattern}\textit{%#<i>#intro pattern#</i>#%}%, using a [|] character to separate out instructions for the different inductive cases. Within a case, we write [?] to ask Coq to generate a name automatically, and we write an explicit name to assign that name to the corresponding new variable. It is apparent that, to use intro patterns to avoid proof brittleness, one needs to keep track of the seemingly unimportant facts of the orders in which variables are introduced. Thus, the script keeps working if we replace [e] by [x], but it has become more cluttered. Arguably, neither proof is particularly easy to follow.
103 104
104 That category of complaint has to do with understanding proofs as static artifacts. As with programming in general, with serious projects, it tends to be much more important to be able to support evolution of proofs as specifications change. Unstructured proofs like the above examples can be very hard to update in concert with theorem statements. For instance, consider how the last proof script plays out when we modify [times] to introduce a bug. *) 105 That category of complaint has to do with understanding proofs as static artifacts. As with programming in general, with serious projects, it tends to be much more important to be able to support evolution of proofs as specifications change. Unstructured proofs like the above examples can be very hard to update in concert with theorem statements. For instance, consider how the last proof script plays out when we modify [times] to introduce a bug. *)
105 106
106 Reset times. 107 Reset times.
107 108
116 induction e as [ | ? IHe1 ? IHe2 ]. 117 induction e as [ | ? IHe1 ? IHe2 ].
117 118
118 trivial. 119 trivial.
119 120
120 simpl. 121 simpl.
121 (** [[ 122 (** %\vspace{-.15in}%[[
122 rewrite IHe1. 123 rewrite IHe1.
123 124 ]]
125
126 <<
124 Error: The reference IHe1 was not found in the current environment. 127 Error: The reference IHe1 was not found in the current environment.
125 128 >>
126 ]]
127 *) 129 *)
128 130
129 Abort. 131 Abort.
130 132
131 (** Can you spot what went wrong, without stepping through the script step-by-step? The problem is that [trivial] never fails. Originally, [trivial] had been succeeding in proving an equality that follows by reflexivity. Our change to [times] leads to a case where that equality is no longer true. [trivial] happily leaves the false equality in place, and we continue on to the span of tactics intended for the second inductive case. Unfortunately, those tactics end up being applied to the %\textit{%#<i>#first#</i>#%}% case instead. 133 (** Can you spot what went wrong, without stepping through the script step-by-step? The problem is that [trivial] never fails. Originally, [trivial] had been succeeding in proving an equality that follows by reflexivity. Our change to [times] leads to a case where that equality is no longer true. The invocation [trivial] happily leaves the false equality in place, and we continue on to the span of tactics intended for the second inductive case. Unfortunately, those tactics end up being applied to the %\textit{%#<i>#first#</i>#%}% case instead.
132 134
133 The problem with [trivial] could be %``%#"#solved#"#%''% by writing [solve [trivial]] instead, so that an error is signaled early on if something unexpected happens. However, the root problem is that the syntax of a tactic invocation does not imply how many subgoals it produces. Much more confusing instances of this problem are possible. For example, if a lemma [L] is modified to take an extra hypothesis, then uses of [apply L] will general more subgoals than before. Old unstructured proof scripts will become hopelessly jumbled, with tactics applied to inappropriate subgoals. Because of the lack of structure, there is usually relatively little to be gleaned from knowledge of the precise point in a proof script where an error is raised. *) 135 The problem with [trivial] could be %``%#"#solved#"#%''% by writing, e.g., [trivial; fail] instead, so that an error is signaled early on if something unexpected happens. However, the root problem is that the syntax of a tactic invocation does not imply how many subgoals it produces. Much more confusing instances of this problem are possible. For example, if a lemma [L] is modified to take an extra hypothesis, then uses of [apply L] will general more subgoals than before. Old unstructured proof scripts will become hopelessly jumbled, with tactics applied to inappropriate subgoals. Because of the lack of structure, there is usually relatively little to be gleaned from knowledge of the precise point in a proof script where an error is raised. *)
134 136
135 Reset times. 137 Reset times.
136 138
137 Fixpoint times (k : nat) (e : exp) : exp := 139 Fixpoint times (k : nat) (e : exp) : exp :=
138 match e with 140 match e with
176 | Mult e1 e2 => Mult (times k e1) e2 178 | Mult e1 e2 => Mult (times k e1) e2
177 end. 179 end.
178 180
179 Theorem eval_times : forall k e, 181 Theorem eval_times : forall k e,
180 eval (times k e) = k * eval e. 182 eval (times k e) = k * eval e.
181 (** [[ 183 (** %\vspace{-.25in}%[[
182 induction e as [ | ? IHe1 ? IHe2 ]; [ 184 induction e as [ | ? IHe1 ? IHe2 ]; [
183 trivial 185 trivial
184 | simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial ]. 186 | simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial ].
185 187 ]]
188
189 <<
186 Error: Expects a disjunctive pattern with 3 branches. 190 Error: Expects a disjunctive pattern with 3 branches.
187 191 >>
188 ]]
189 *) 192 *)
190 193
191 Abort. 194 Abort.
192 195
193 (** Unsurprisingly, the old proof fails, because it explicitly says that there are two inductive cases. To update the script, we must, at a minimum, remember the order in which the inductive cases are generated, so that we can insert the new case in the appropriate place. Even then, it will be painful to add the case, because we cannot walk through proof steps interactively when they occur inside an explicit set of cases. *) 196 (** Unsurprisingly, the old proof fails, because it explicitly says that there are two inductive cases. To update the script, we must, at a minimum, remember the order in which the inductive cases are generated, so that we can insert the new case in the appropriate place. Even then, it will be painful to add the case, because we cannot walk through proof steps interactively when they occur inside an explicit set of cases. *)
246 (** One subgoal remains: 249 (** One subgoal remains:
247 [[ 250 [[
248 IHe2 : eval e3 * eval e4 = eval e2 251 IHe2 : eval e3 * eval e4 = eval e2
249 ============================ 252 ============================
250 eval e1 * eval e3 * eval e4 = eval e1 * eval e2 253 eval e1 * eval e3 * eval e4 = eval e1 * eval e2
251
252 ]] 254 ]]
253 255
254 [crush] does not know how to finish this goal. We could finish the proof manually. *) 256 [crush] does not know how to finish this goal. We could finish the proof manually. *)
255 257
256 rewrite <- IHe2; crush. 258 rewrite <- IHe2; crush.
278 The more common situation is that a large induction has several easy cases that automation makes short work of. In the remaining cases, automation performs some standard simplification. Among these cases, some may require quite involved proofs; such a case may deserve a hint lemma of its own, where the lemma statement may copy the simplified version of the case. Alternatively, the proof script for the main theorem may be extended with some automation code targeted at the specific case. Even such targeted scripting is more desirable than manual proving, because it may be read and understood without knowledge of a proof's hierarchical structure, case ordering, or name binding structure. *) 280 The more common situation is that a large induction has several easy cases that automation makes short work of. In the remaining cases, automation performs some standard simplification. Among these cases, some may require quite involved proofs; such a case may deserve a hint lemma of its own, where the lemma statement may copy the simplified version of the case. Alternatively, the proof script for the main theorem may be extended with some automation code targeted at the specific case. Even such targeted scripting is more desirable than manual proving, because it may be read and understood without knowledge of a proof's hierarchical structure, case ordering, or name binding structure. *)
279 281
280 282
281 (** * Debugging and Maintaining Automation *) 283 (** * Debugging and Maintaining Automation *)
282 284
283 (** Fully-automated proofs are desirable because they open up possibilities for automatic adaptation to changes of specification. A well-engineered script within a narrow domain can survive many changes to the formulation of the problem it solves. Still, as we are working with higher-order logic, most theorems fall within no obvious decidable theories. It is inevitable that most long-lived automated proofs will need updating. 285 (** Fully automated proofs are desirable because they open up possibilities for automatic adaptation to changes of specification. A well-engineered script within a narrow domain can survive many changes to the formulation of the problem it solves. Still, as we are working with higher-order logic, most theorems fall within no obvious decidable theories. It is inevitable that most long-lived automated proofs will need updating.
284 286
285 Before we are ready to update our proofs, we need to write them in the first place. While fully-automated scripts are most robust to changes of specification, it is hard to write every new proof directly in that form. Instead, it is useful to begin a theorem with exploratory proving and then gradually refine it into a suitable automated form. 287 Before we are ready to update our proofs, we need to write them in the first place. While fully automated scripts are most robust to changes of specification, it is hard to write every new proof directly in that form. Instead, it is useful to begin a theorem with exploratory proving and then gradually refine it into a suitable automated form.
286 288
287 Consider this theorem from Chapter 8, which we begin by proving in a mostly manual way, invoking [crush] after each steop to discharge any low-hanging fruit. Our manual effort involves choosing which expressions to case-analyze on. *) 289 Consider this theorem from Chapter 8, which we begin by proving in a mostly manual way, invoking [crush] after each steop to discharge any low-hanging fruit. Our manual effort involves choosing which expressions to case-analyze on. *)
288 290
289 (* begin hide *) 291 (* begin hide *)
290 Require Import MoreDep. 292 Require Import MoreDep.
341 (** The subgoal's conclusion is: 343 (** The subgoal's conclusion is:
342 [[ 344 [[
343 ============================ 345 ============================
344 (if expDenote e1 then expDenote (cfold e2) else expDenote (cfold e3)) = 346 (if expDenote e1 then expDenote (cfold e2) else expDenote (cfold e3)) =
345 expDenote (if expDenote e1 then cfold e2 else cfold e3) 347 expDenote (if expDenote e1 then cfold e2 else cfold e3)
346
347 ]] 348 ]]
348 349
349 We need to expand our [t] tactic to handle this case. *) 350 We need to expand our [t] tactic to handle this case. *)
350 351
351 Ltac t' := 352 Ltac t' :=
382 t''. 383 t''.
383 384
384 t''. 385 t''.
385 Qed. 386 Qed.
386 387
387 (** We can take the final tactic and move it into the initial part of the proof script, arriving at a nicely-automated proof. *) 388 (** We can take the final tactic and move it into the initial part of the proof script, arriving at a nicely automated proof. *)
388 389
389 Reset t. 390 Reset t.
390 391
391 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e). 392 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
392 induction e; crush; 393 induction e; crush;
401 | None => _ end] ] => 402 | None => _ end] ] =>
402 dep_destruct E 403 dep_destruct E
403 end; crush). 404 end; crush).
404 Qed. 405 Qed.
405 406
406 (** Even after we put together nice automated proofs, we must deal with specification changes that can invalidate them. It is not generally possible to step through single-tactic proofs interactively. There is a command [Debug On] that lets us step through points in tactic execution, but the debugger tends to make counterintuitive choices of which points we would like to stop at, and per-point output is quite verbose, so most Coq users do not find this debugging mode very helpful. How are we to understand what has broken in a script that used to work? 407 (** Even after we put together nice automated proofs, we must deal with specification changes that can invalidate them. It is not generally possible to step through single-tactic proofs interactively. There is a command %\index{Vernacular commands!Debug On}%[Debug On] that lets us step through points in tactic execution, but the debugger tends to make counterintuitive choices of which points we would like to stop at, and per-point output is quite verbose, so most Coq users do not find this debugging mode very helpful. How are we to understand what has broken in a script that used to work?
407 408
408 An example helps demonstrate a useful approach. Consider what would have happened in our proof of [reassoc_correct] if we had first added an unfortunate rewriting hint. *) 409 An example helps demonstrate a useful approach. Consider what would have happened in our proof of [reassoc_correct] if we had first added an unfortunate rewriting hint. *)
409 410
410 Reset reassoc_correct. 411 Reset reassoc_correct.
411 412
426 (** One subgoal remains: 427 (** One subgoal remains:
427 428
428 [[ 429 [[
429 ============================ 430 ============================
430 eval e1 * (eval e3 + 1 - 1) * eval e4 = eval e1 * eval e2 431 eval e1 * (eval e3 + 1 - 1) * eval e4 = eval e1 * eval e2
431
432 ]] 432 ]]
433 433
434 The poorly-chosen rewrite rule fired, changing the goal to a form where another hint no longer applies. Imagine that we are in the middle of a large development with many hints. How would we diagnose the problem? First, we might not be sure which case of the inductive proof has gone wrong. It is useful to separate out our automation procedure and apply it manually. *) 434 The poorly chosen rewrite rule fired, changing the goal to a form where another hint no longer applies. Imagine that we are in the middle of a large development with many hints. How would we diagnose the problem? First, we might not be sure which case of the inductive proof has gone wrong. It is useful to separate out our automation procedure and apply it manually. *)
435 435
436 Restart. 436 Restart.
437 437
438 Ltac t := crush; match goal with 438 Ltac t := crush; match goal with
439 | [ |- context[match ?E with Const _ => _ | Plus _ _ => _ 439 | [ |- context[match ?E with Const _ => _ | Plus _ _ => _
453 453
454 (** The final subgoal is for multiplication, and it is here that we get stuck in the proof state summarized above. *) 454 (** The final subgoal is for multiplication, and it is here that we get stuck in the proof state summarized above. *)
455 455
456 t. 456 t.
457 457
458 (** What is [t] doing to get us to this point? The [info] command can help us answer this kind of question. *) 458 (** What is [t] doing to get us to this point? The %\index{tactics!info}%[info] command can help us answer this kind of question. *)
459 459
460 (** remove printing * *) 460 (** remove printing * *)
461 Undo. 461 Undo.
462 info t. 462 info t.
463 (** [[ 463 (** %\vspace{-.15in}%[[
464 == simpl in *; intuition; subst; autorewrite with cpdt in *; 464 == simpl in *; intuition; subst; autorewrite with cpdt in *;
465 simpl in *; intuition; subst; autorewrite with cpdt in *; 465 simpl in *; intuition; subst; autorewrite with cpdt in *;
466 simpl in *; intuition; subst; destruct (reassoc e2). 466 simpl in *; intuition; subst; destruct (reassoc e2).
467 simpl in *; intuition. 467 simpl in *; intuition.
468 468
506 (** It was the final of these four tactics that made the rewrite. We can find out exactly what happened. The [info] command presents hierarchical views of proof steps, and we can zoom down to a lower level of detail by applying [info] to one of the steps that appeared in the original trace. *) 506 (** It was the final of these four tactics that made the rewrite. We can find out exactly what happened. The [info] command presents hierarchical views of proof steps, and we can zoom down to a lower level of detail by applying [info] to one of the steps that appeared in the original trace. *)
507 507
508 Undo. 508 Undo.
509 509
510 info autorewrite with cpdt in *. 510 info autorewrite with cpdt in *.
511 (** [[ 511 (** %\vspace{-.15in}%[[
512 == refine (eq_ind_r (fun n : nat => n = eval e1 * eval e2) _ 512 == refine (eq_ind_r (fun n : nat => n = eval e1 * eval e2) _
513 (confounder (reassoc e1) e3 e4)). 513 (confounder (reassoc e1) e3 e4)).
514
515 ]] 514 ]]
516 515
517 The way a rewrite is displayed is somewhat baroque, but we can see that theorem [confounder] is the final culprit. At this point, we could remove that hint, prove an alternate version of the key lemma [rewr], or come up with some other remedy. Fixing this kind of problem tends to be relatively easy once the problem is revealed. *) 516 The way a rewrite is displayed is somewhat baroque, but we can see that theorem [confounder] is the final culprit. At this point, we could remove that hint, prove an alternate version of the key lemma [rewr], or come up with some other remedy. Fixing this kind of problem tends to be relatively easy once the problem is revealed. *)
518 517
519 Abort. 518 Abort.
534 Variable f : A -> A. 533 Variable f : A -> A.
535 534
536 Hypothesis H1 : forall x y, P x y -> Q x y -> R x y -> f x = f y. 535 Hypothesis H1 : forall x y, P x y -> Q x y -> R x y -> f x = f y.
537 Hypothesis H2 : forall x y, S x y -> R x y. 536 Hypothesis H2 : forall x y, S x y -> R x y.
538 537
539 (** We prove a simple lemma very quickly, using the [Time] command to measure exactly how quickly. *) 538 (** We prove a simple lemma very quickly, using the %\index{Vernacular commands!Time}%[Time] command to measure exactly how quickly. *)
540 539
541 Lemma slow : forall x y, P x y -> Q x y -> S x y -> f x = f y. 540 Lemma slow : forall x y, P x y -> Q x y -> S x y -> f x = f y.
542 Time eauto 6. 541 Time eauto 6.
543 (** [[ 542 (** %\vspace{-.2in}%[[
544 Finished transaction in 0. secs (0.068004u,0.s) 543 Finished transaction in 0. secs (0.068004u,0.s)
545 ]] 544 ]]
546 *) 545 *)
547 546
548 Qed. 547 Qed.
551 550
552 Hypothesis H3 : forall x y, x = y -> f x = f y. 551 Hypothesis H3 : forall x y, x = y -> f x = f y.
553 552
554 Lemma slow' : forall x y, P x y -> Q x y -> S x y -> f x = f y. 553 Lemma slow' : forall x y, P x y -> Q x y -> S x y -> f x = f y.
555 Time eauto 6. 554 Time eauto 6.
556 (** [[ 555 (** %\vspace{-.2in}%[[
557 Finished transaction in 2. secs (1.264079u,0.s) 556 Finished transaction in 2. secs (1.264079u,0.s)
558
559 ]] 557 ]]
560 558
561 Why has the search time gone up so much? The [info] command is not much help, since it only shows the result of search, not all of the paths that turned out to be worthless. *) 559 Why has the search time gone up so much? The [info] command is not much help, since it only shows the result of search, not all of the paths that turned out to be worthless. *)
562 560
563 Restart. 561 Restart.
564 info eauto 6. 562 info eauto 6.
565 (** [[ 563 (** %\vspace{-.15in}%[[
566 == intro x; intro y; intro H; intro H0; intro H4; 564 == intro x; intro y; intro H; intro H0; intro H4;
567 simple eapply trans_eq. 565 simple eapply trans_eq.
568 simple apply refl_equal. 566 simple apply refl_equal.
569 567
570 simple eapply trans_eq. 568 simple eapply trans_eq.
577 eexact H. 575 eexact H.
578 576
579 eexact H0. 577 eexact H0.
580 578
581 simple apply H2; eexact H4. 579 simple apply H2; eexact H4.
582
583 ]] 580 ]]
584 581
585 This output does not tell us why proof search takes so long, but it does provide a clue that would be useful if we had forgotten that we added transitivity as a hint. The [eauto] tactic is applying depth-first search, and the proof script where the real action is ends up buried inside a chain of pointless invocations of transitivity, where each invocation uses reflexivity to discharge one subgoal. Each increment to the depth argument to [eauto] adds another silly use of transitivity. This wasted proof effort only adds linear time overhead, as long as proof search never makes false steps. No false steps were made before we added the new hypothesis, but somehow the addition made possible a new faulty path. To understand which paths we enabled, we can use the [debug] command. *) 582 This output does not tell us why proof search takes so long, but it does provide a clue that would be useful if we had forgotten that we added transitivity as a hint. The [eauto] tactic is applying depth-first search, and the proof script where the real action is ends up buried inside a chain of pointless invocations of transitivity, where each invocation uses reflexivity to discharge one subgoal. Each increment to the depth argument to [eauto] adds another silly use of transitivity. This wasted proof effort only adds linear time overhead, as long as proof search never makes false steps. No false steps were made before we added the new hypothesis, but somehow the addition made possible a new faulty path. To understand which paths we enabled, we can use the %\index{tactics!debug}%[debug] command. *)
586 583
587 Restart. 584 Restart.
588 debug eauto 6. 585 debug eauto 6.
589 586
590 (** The output is a large proof tree. The beginning of the tree is enough to reveal what is happening: 587 (** The output is a large proof tree. The beginning of the tree is enough to reveal what is happening:
591
592 [[ 588 [[
593 1 depth=6 589 1 depth=6
594 1.1 depth=6 intro 590 1.1 depth=6 intro
595 1.1.1 depth=6 intro 591 1.1.1 depth=6 intro
596 1.1.1.1 depth=6 intro 592 1.1.1.1 depth=6 intro
614 1.1.1.1.1.1.1.1.1.1.1.1.2.1.1 depth=1 apply refl_equal 610 1.1.1.1.1.1.1.1.1.1.1.1.2.1.1 depth=1 apply refl_equal
615 1.1.1.1.1.1.1.1.1.1.1.1.2.1.1.1 depth=0 eapply trans_eq 611 1.1.1.1.1.1.1.1.1.1.1.1.2.1.1.1 depth=0 eapply trans_eq
616 1.1.1.1.1.1.1.1.1.1.1.1.2.1.2 depth=1 apply sym_eq ; trivial 612 1.1.1.1.1.1.1.1.1.1.1.1.2.1.2 depth=1 apply sym_eq ; trivial
617 1.1.1.1.1.1.1.1.1.1.1.1.2.1.2.1 depth=0 eapply trans_eq 613 1.1.1.1.1.1.1.1.1.1.1.1.2.1.2.1 depth=0 eapply trans_eq
618 1.1.1.1.1.1.1.1.1.1.1.1.2.1.3 depth=0 eapply trans_eq 614 1.1.1.1.1.1.1.1.1.1.1.1.2.1.3 depth=0 eapply trans_eq
619
620 ]] 615 ]]
621 616
622 The first choice [eauto] makes is to apply [H3], since [H3] has the fewest hypotheses of all of the hypotheses and hints that match. However, it turns out that the single hypothesis generated is unprovable. That does not stop [eauto] from trying to prove it with an exponentially-sized tree of applications of transitivity, reflexivity, and symmetry of equality. It is the children of the initial [apply H3] that account for all of the noticeable time in proof execution. In a more realistic development, we might use this output of [info] to realize that adding transitivity as a hint was a bad idea. *) 617 The first choice [eauto] makes is to apply [H3], since [H3] has the fewest hypotheses of all of the hypotheses and hints that match. However, it turns out that the single hypothesis generated is unprovable. That does not stop [eauto] from trying to prove it with an exponentially sized tree of applications of transitivity, reflexivity, and symmetry of equality. It is the children of the initial [apply H3] that account for all of the noticeable time in proof execution. In a more realistic development, we might use this output of [debug] to realize that adding transitivity as a hint was a bad idea. *)
623 618
624 Qed. 619 Qed.
625 End slow. 620 End slow.
626 621
627 (** It is also easy to end up with a proof script that uses too much memory. As tactics run, they avoid generating proof terms, since serious proof search will consider many possible avenues, and we do not want to built proof terms for subproofs that end up unused. Instead, tactic execution maintains %\textit{%#<i>#thunks#</i>#%}% (suspended computations, represented with closures), such that a tactic's proof-producing thunk is only executed when we run [Qed]. These thunks can use up large amounts of space, such that a proof script exhausts available memory, even when we know that we could have used much less memory by forcing some thunks earlier. 622 (** It is also easy to end up with a proof script that uses too much memory. As tactics run, they avoid generating proof terms, since serious proof search will consider many possible avenues, and we do not want to build proof terms for subproofs that end up unused. Instead, tactic execution maintains %\index{thunks}\textit{%#<i>#thunks#</i>#%}% (suspended computations, represented with closures), such that a tactic's proof-producing thunk is only executed when we run [Qed]. These thunks can use up large amounts of space, such that a proof script exhausts available memory, even when we know that we could have used much less memory by forcing some thunks earlier.
628 623
629 The [abstract] tactical helps us force thunks by proving some subgoals as their own lemmas. For instance, a proof [induction x; crush] can in many cases be made to use significantly less peak memory by changing it to [induction x; abstract crush]. The main limitation of [abstract] is that it can only be applied to subgoals that are proved completely, with no undetermined unification variables remaining. Still, many large automated proofs can realize vast memory savings via [abstract]. *) 624 The %\index{tactics!abstract}%[abstract] tactical helps us force thunks by proving some subgoals as their own lemmas. For instance, a proof [induction x; crush] can in many cases be made to use significantly less peak memory by changing it to [induction x; abstract crush]. The main limitation of [abstract] is that it can only be applied to subgoals that are proved completely, with no undetermined unification variables remaining. Still, many large automated proofs can realize vast memory savings via [abstract]. *)
630 625
631 626
632 (** * Modules *) 627 (** * Modules *)
633 628
634 (** Last chapter's examples of proof by reflection demonstrate opportunities for implementing abstract proof strategies with stronger formal guarantees than can be had with Ltac scripting. Coq's %\textit{%#<i>#module system#</i>#%}% provides another tool for more rigorous development of generic theorems. This feature is inspired by the module systems found in Standard ML and Objective Caml, and the discussion that follows assumes familiarity with the basics of one of those systems. 629 (** Last chapter's examples of proof by reflection demonstrate opportunities for implementing abstract proof strategies with stronger formal guarantees than can be had with Ltac scripting. Coq's %\textit{%#<i>#module system#</i>#%}% provides another tool for more rigorous development of generic theorems. This feature is inspired by the module systems found in Standard ML%~\cite{modules}% and Objective Caml, and the discussion that follows assumes familiarity with the basics of one of those systems.
635 630
636 ML modules facilitate the grouping of abstract types with operations over those types. Moreover, there is support for %\textit{%#<i>#functors#</i>#%}%, which are functions from modules to modules. A canonical example of a functor is one that builds a data structure implementation from a module that describes a domain of keys and its associated comparison operations. 631 ML modules facilitate the grouping of %\index{abstract type}%abstract types with operations over those types. Moreover, there is support for %\index{functor}\textit{%#<i>#functors#</i>#%}%, which are functions from modules to modules. A canonical example of a functor is one that builds a data structure implementation from a module that describes a domain of keys and its associated comparison operations.
637 632
638 When we add modules to a base language with dependent types, it becomes possible to use modules and functors to formalize kinds of reasoning that are common in algebra. For instance, this module signature captures the essence of the algebraic structure known as a group. A group consists of a carrier set [G], an associative binary operation [f], a left identity element [e] for [f], and an operation [i] that is a left inverse for [f]. *) 633 When we add modules to a base language with dependent types, it becomes possible to use modules and functors to formalize kinds of reasoning that are common in algebra. For instance, this module signature captures the essence of the algebraic structure known as a group. A group consists of a carrier set [G], an associative binary operation [f], a left identity element [e] for [f], and an operation [i] that is a left inverse for [f].%\index{Vernacular commands!Module Type}% *)
639 634
640 Module Type GROUP. 635 Module Type GROUP.
641 Parameter G : Set. 636 Parameter G : Set.
642 Parameter f : G -> G -> G. 637 Parameter f : G -> G -> G.
643 Parameter e : G. 638 Parameter e : G.
646 Axiom assoc : forall a b c, f (f a b) c = f a (f b c). 641 Axiom assoc : forall a b c, f (f a b) c = f a (f b c).
647 Axiom ident : forall a, f e a = a. 642 Axiom ident : forall a, f e a = a.
648 Axiom inverse : forall a, f (i a) a = e. 643 Axiom inverse : forall a, f (i a) a = e.
649 End GROUP. 644 End GROUP.
650 645
651 (** Many useful theorems hold of arbitrary groups. We capture some such theorem statements in another module signature. *) 646 (** Many useful theorems hold of arbitrary groups. We capture some such theorem statements in another module signature.%\index{Vernacular commands!Declare Module}% *)
652 647
653 Module Type GROUP_THEOREMS. 648 Module Type GROUP_THEOREMS.
654 Declare Module M : GROUP. 649 Declare Module M : GROUP.
655 650
656 Axiom ident' : forall a, M.f a M.e = a. 651 Axiom ident' : forall a, M.f a M.e = a.
658 Axiom inverse' : forall a, M.f a (M.i a) = M.e. 653 Axiom inverse' : forall a, M.f a (M.i a) = M.e.
659 654
660 Axiom unique_ident : forall e', (forall a, M.f e' a = a) -> e' = M.e. 655 Axiom unique_ident : forall e', (forall a, M.f e' a = a) -> e' = M.e.
661 End GROUP_THEOREMS. 656 End GROUP_THEOREMS.
662 657
663 (** We implement generic proofs of these theorems with a functor, whose input is an arbitrary group [M]. The proofs are completely manual, since it would take some effort to build suitable generic automation; rather, these theorems can serve as a basis for an automated procedure for simplifying group expressions, along the lines of the procedure for monoids from the last chapter. We take the proofs from the Wikipedia page on elementary group theory. *) 658 (** We implement generic proofs of these theorems with a functor, whose input is an arbitrary group [M]. The proofs are completely manual, since it would take some effort to build suitable generic automation; rather, these theorems can serve as a basis for an automated procedure for simplifying group expressions, along the lines of the procedure for monoids from the last chapter. We take the proofs from the Wikipedia page on elementary group theory.%\index{Vernacular commands!Module}% *)
664 659
665 Module Group (M : GROUP) : GROUP_THEOREMS with Module M := M. 660 Module Group (M : GROUP) : GROUP_THEOREMS with Module M := M.
666 Module M := M. 661 Module M := M.
667 662
668 Import M. 663 Import M.
724 Check IntTheorems.unique_ident. 719 Check IntTheorems.unique_ident.
725 (** %\vspace{-.15in}% [[ 720 (** %\vspace{-.15in}% [[
726 IntTheorems.unique_ident 721 IntTheorems.unique_ident
727 : forall e' : Int.G, (forall a : Int.G, Int.f e' a = a) -> e' = Int.e 722 : forall e' : Int.G, (forall a : Int.G, Int.f e' a = a) -> e' = Int.e
728 ]] 723 ]]
729 *) 724
725 Projections like [Int.G] are known to be definitionally equal to the concrete values we have assigned to them, so the above theorem yields as a trivial corollary the following more natural restatement: *)
730 726
731 Theorem unique_ident : forall e', (forall a, e' + a = a) -> e' = 0. 727 Theorem unique_ident : forall e', (forall a, e' + a = a) -> e' = 0.
732 exact IntTheorems.unique_ident. 728 exact IntTheorems.unique_ident.
733 Qed. 729 Qed.
734 730
735 (** As in ML, the module system provides an effective way to structure large developments. Unlike in ML, Coq modules add no expressiveness; we can implement any module as an inhabitant of a dependent record type. It is the second-class nature of modules that makes them easier to use than dependent records in many case. Because modules may only be used in quite restricted ways, it is easier to support convenient module coding through special commands and editing modes, as the above example demonstrates. An isomorphic implementation with records would have suffered from lack of such conveniences as module subtyping and importation of the fields of a module. *) 731 (** As in ML, the module system provides an effective way to structure large developments. Unlike in ML, Coq modules add no expressiveness; we can implement any module as an inhabitant of a dependent record type. It is the second-class nature of modules that makes them easier to use than dependent records in many case. Because modules may only be used in quite restricted ways, it is easier to support convenient module coding through special commands and editing modes, as the above example demonstrates. An isomorphic implementation with records would have suffered from lack of such conveniences as module subtyping and importation of the fields of a module. On the other hand, all module values must be determined statically, so modules may not be computed, e.g., within the defintions of normal functions, based on particular function parameters. *)
736 732
737 733
738 (** * Build Processes *) 734 (** * Build Processes *)
739 735
740 (** As in software development, large Coq projects are much more manageable when split across multiple files and when decomposed into libraries. Coq and Proof General provide very good support for these activities. 736 (** As in software development, large Coq projects are much more manageable when split across multiple files and when decomposed into libraries. Coq and Proof General provide very good support for these activities.
741 737
742 Consider a library that we will name [Lib], housed in directory %\texttt{%#<tt>#LIB#</tt>#%}% and split between files %\texttt{%#<tt>#A.v#</tt>#%}%, %\texttt{%#<tt>#B.v#</tt>#%}%, and %\texttt{%#<tt>#C.v#</tt>#%}%. A simple Makefile will compile the library, relying on the standard Coq tool %\texttt{%#<tt>#coq_makefile#</tt>#%}% to do the hard work. 738 Consider a library that we will name [Lib], housed in directory %\texttt{%#<tt>#LIB#</tt>#%}% and split between files %\texttt{%#<tt>#A.v#</tt>#%}%, %\texttt{%#<tt>#B.v#</tt>#%}%, and %\texttt{%#<tt>#C.v#</tt>#%}%. A simple %\index{Makefile}%Makefile will compile the library, relying on the standard Coq tool %\index{coq\_makefile}\texttt{%#<tt>#coq_makefile#</tt>#%}% to do the hard work.
743 739
744 << 740 <<
745 MODULES := A B C 741 MODULES := A B C
746 VS := $(MODULES:%=%.v) 742 VS := $(MODULES:%=%.v)
747 743
759 >> 755 >>
760 756
761 The Makefile begins by defining a variable %\texttt{%#<tt>#VS#</tt>#%}% holding the list of filenames to be included in the project. The primary target is %\texttt{%#<tt>#coq#</tt>#%}%, which depends on the construction of an auxiliary Makefile called %\texttt{%#<tt>#Makefile.coq#</tt>#%}%. Another rule explains how to build that file. We call %\texttt{%#<tt>#coq_makefile#</tt>#%}%, using the %\texttt{%#<tt>#-R#</tt>#%}% flag to specify that files in the current directory should be considered to belong to the library [Lib]. This Makefile will build a compiled version of each module, such that %\texttt{%#<tt>#X.v#</tt>#%}% is compiled into %\texttt{%#<tt>#X.vo#</tt>#%}%. 757 The Makefile begins by defining a variable %\texttt{%#<tt>#VS#</tt>#%}% holding the list of filenames to be included in the project. The primary target is %\texttt{%#<tt>#coq#</tt>#%}%, which depends on the construction of an auxiliary Makefile called %\texttt{%#<tt>#Makefile.coq#</tt>#%}%. Another rule explains how to build that file. We call %\texttt{%#<tt>#coq_makefile#</tt>#%}%, using the %\texttt{%#<tt>#-R#</tt>#%}% flag to specify that files in the current directory should be considered to belong to the library [Lib]. This Makefile will build a compiled version of each module, such that %\texttt{%#<tt>#X.v#</tt>#%}% is compiled into %\texttt{%#<tt>#X.vo#</tt>#%}%.
762 758
763 Now code in %\texttt{%#<tt>#B.v#</tt>#%}% may refer to definitions in %\texttt{%#<tt>#A.v#</tt>#%}% after running 759 Now code in %\texttt{%#<tt>#B.v#</tt>#%}% may refer to definitions in %\texttt{%#<tt>#A.v#</tt>#%}% after running
764
765 [[ 760 [[
766 Require Import Lib.A. 761 Require Import Lib.A.
767
768 ]] 762 ]]
769 763 %\vspace{-.15in}%Library [Lib] is presented as a module, containing a submodule [A], which contains the definitions from %\texttt{%#<tt>#A.v#</tt>#%}%. These are genuine modules in the sense of Coq's module system, and they may be passed to functors and so on.
770 Library [Lib] is presented as a module, containing a submodule [A], which contains the definitions from %\texttt{%#<tt>#A.v#</tt>#%}%. These are genuine modules in the sense of Coq's module system, and they may be passed to functors and so on. 764
771 765 The command [Require Import] is a convenient combination of two more primitive commands. The %\index{Vernacular commands!Require}%[Require] command finds the %\texttt{%#<tt>#.vo#</tt>#%}% file containing the named module, ensuring that the module is loaded into memory. The %\index{Vernacular commands!Import}%[Import] command loads all top-level definitions of the named module into the current namespace, and it may be used with local modules that do not have corresponding %\texttt{%#<tt>#.vo#</tt>#%}% files. Another command, %\index{Vernacular commands!Load}%[Load], is for inserting the contents of a named file verbatim. It is generally better to use the module-based commands, since they avoid rerunning proof scripts, and they facilitate reorganization of directory structure without the need to change code.
772 [Require Import] is a convenient combination of two more primitive commands. [Require] finds the %\texttt{%#<tt>#.vo#</tt>#%}% file containing the named module, ensuring that the module is loaded into memory. [Import] loads all top-level definitions of the named module into the current namespace, and it may be used with local modules that do not have corresponding %\texttt{%#<tt>#.vo#</tt>#%}% files. Another command, [Load], is for inserting the contents of a named file verbatim. It is generally better to use the module-based commands, since they avoid rerunning proof scripts, and they facilitate reorganization of directory structure without the need to change code.
773 766
774 Now we would like to use our library from a different development, called [Client] and found in directory %\texttt{%#<tt>#CLIENT#</tt>#%}%, which has its own Makefile. 767 Now we would like to use our library from a different development, called [Client] and found in directory %\texttt{%#<tt>#CLIENT#</tt>#%}%, which has its own Makefile.
775 768
776 << 769 <<
777 MODULES := D E 770 MODULES := D E
789 make -f Makefile.coq clean 782 make -f Makefile.coq clean
790 rm -f Makefile.coq 783 rm -f Makefile.coq
791 >> 784 >>
792 785
793 We change the %\texttt{%#<tt>#coq_makefile#</tt>#%}% call to indicate where the library [Lib] is found. Now %\texttt{%#<tt>#D.v#</tt>#%}% and %\texttt{%#<tt>#E.v#</tt>#%}% can refer to definitions from [Lib] module [A] after running 786 We change the %\texttt{%#<tt>#coq_makefile#</tt>#%}% call to indicate where the library [Lib] is found. Now %\texttt{%#<tt>#D.v#</tt>#%}% and %\texttt{%#<tt>#E.v#</tt>#%}% can refer to definitions from [Lib] module [A] after running
794
795 [[ 787 [[
796 Require Import Lib.A. 788 Require Import Lib.A.
797
798 ]] 789 ]]
799 790 %\vspace{-.15in}\noindent{}%and %\texttt{%#<tt>#E.v#</tt>#%}% can refer to definitions from %\texttt{%#<tt>#D.v#</tt>#%}% by running
800 and %\texttt{%#<tt>#E.v#</tt>#%}% can refer to definitions from %\texttt{%#<tt>#D.v#</tt>#%}% by running
801
802 [[ 791 [[
803 Require Import Client.D. 792 Require Import Client.D.
804
805 ]] 793 ]]
806 794 %\vspace{-.15in}%It can be useful to split a library into several files, but it is also inconvenient for client code to import library modules individually. We can get the best of both worlds by, for example, adding an extra source file %\texttt{%#<tt>#Lib.v#</tt>#%}% to [Lib]'s directory and Makefile, where that file contains just this line:%\index{Vernacular commands!Require Export}%
807 It can be useful to split a library into several files, but it is also inconvenient for client code to import library modules individually. We can get the best of both worlds by, for example, adding an extra source file %\texttt{%#<tt>#Lib.v#</tt>#%}% to [Lib]'s directory and Makefile.
808
809 [[ 795 [[
810 Require Export Lib.A Lib.B Lib.C. 796 Require Export Lib.A Lib.B Lib.C.
811
812 ]] 797 ]]
813 798 %\vspace{-.15in}%Now client code can import all definitions from all of [Lib]'s modules simply by running
814 Now client code can import all definitions from all of [Lib]'s modules simply by running
815
816 [[ 799 [[
817 Require Import Lib. 800 Require Import Lib.
818
819 ]] 801 ]]
820 802 %\vspace{-.15in}%The two Makefiles above share a lot of code, so, in practice, it is useful to define a common Makefile that is included by multiple library-specific Makefiles.
821 The two Makefiles above share a lot of code, so, in practice, it is useful to define a common Makefile that is included by multiple library-specific Makefiles.
822 803
823 %\medskip% 804 %\medskip%
824 805
825 The remaining ingredient is the proper way of editing library code files in Proof General. Recall this snippet of %\texttt{%#<tt>#.emacs#</tt>#%}% code from Chapter 2, which tells Proof General where to find the library associated with this book. 806 The remaining ingredient is the proper way of editing library code files in Proof General. Recall this snippet of %\texttt{%#<tt>#.emacs#</tt>#%}% code from Chapter 2, which tells Proof General where to find the library associated with this book.
826 807