Certified Programming with Dependent Types: src/Large.v annotate

annotate src/Large.v @ 241:cb3f3ef9d5bb

slow

author	Adam Chlipala <adamc@hcoop.net>
date	Wed, 09 Dec 2009 11:36:37 -0500
parents	b28c6e6eca0c
children	5a32784e30f3

rev	line source
adamc@235	1 (* Copyright (c) 2009, Adam Chlipala
adamc@235	2 *
adamc@235	3 * This work is licensed under a
adamc@235	4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
adamc@235	5 * Unported License.
adamc@235	6 * The license text is available at:
adamc@235	7 * http://creativecommons.org/licenses/by-nc-nd/3.0/
adamc@235	8 *)
adamc@235	9
adamc@235	10 (* begin hide *)
adamc@236	11 Require Import Arith.
adamc@236	12
adamc@235	13 Require Import Tactics.
adamc@235	14
adamc@235	15 Set Implicit Arguments.
adamc@235	16 (* end hide *)
adamc@235	17
adamc@235	18
adamc@235	19 (** %\chapter{Proving in the Large}% *)
adamc@235	20
adamc@236	21 (** It is somewhat unfortunate that the term "theorem-proving" looks so much like the word "theory." Most researchers and practitioners in software assume that mechanized theorem-proving is profoundly impractical. Indeed, until recently, most advances in theorem-proving for higher-order logics have been largely theoretical. However, starting around the beginning of the 21st century, there was a surge in the use of proof assistants in serious verification efforts. That line of work is still quite new, but I believe it is not too soon to distill some lessons on how to work effectively with large formal proofs.
adamc@236	22
adamc@236	23 Thus, this chapter gives some tips for structuring and maintaining large Coq developments. *)
adamc@236	24
adamc@236	25
adamc@236	26 (** * Ltac Anti-Patterns *)
adamc@236	27
adamc@237	28 (** In this book, I have been following an unusual style, where proofs are not considered finished until they are "fully automated," in a certain sense. SEach such theorem is proved by a single tactic. Since Ltac is a Turing-complete programming language, it is not hard to squeeze arbitrary heuristics into single tactics, using operators like the semicolon to combine steps. In contrast, most Ltac proofs "in the wild" consist of many steps, performed by individual tactics followed by periods. Is it really worth drawing a distinction between proof steps terminated by semicolons and steps terminated by periods?
adamc@236	29
adamc@237	30 I argue that this is, in fact, a very important distinction, with serious consequences for a majority of important verification domains. The more uninteresting drudge work a proof domain involves, the more important it is to work to prove theorems with single tactics. From an automation standpoint, single-tactic proofs can be extremely effective, and automation becomes more and more critical as proofs are populated by more uninteresting detail. In this section, I will give some examples of the consequences of more common proof styles.
adamc@236	31
adamc@236	32 As a running example, consider a basic language of arithmetic expressions, an interpreter for it, and a transformation that scales up every constant in an expression. *)
adamc@236	33
adamc@236	34 Inductive exp : Set :=
adamc@236	35 \| Const : nat -> exp
adamc@236	36 \| Plus : exp -> exp -> exp.
adamc@236	37
adamc@236	38 Fixpoint eval (e : exp) : nat :=
adamc@236	39 match e with
adamc@236	40 \| Const n => n
adamc@236	41 \| Plus e1 e2 => eval e1 + eval e2
adamc@236	42 end.
adamc@236	43
adamc@236	44 Fixpoint times (k : nat) (e : exp) : exp :=
adamc@236	45 match e with
adamc@236	46 \| Const n => Const (k * n)
adamc@236	47 \| Plus e1 e2 => Plus (times k e1) (times k e2)
adamc@236	48 end.
adamc@236	49
adamc@236	50 (** We can write a very manual proof that [double] really doubles an expression's value. *)
adamc@236	51
adamc@236	52 Theorem eval_times : forall k e,
adamc@236	53 eval (times k e) = k * eval e.
adamc@236	54 induction e.
adamc@236	55
adamc@236	56 trivial.
adamc@236	57
adamc@236	58 simpl.
adamc@236	59 rewrite IHe1.
adamc@236	60 rewrite IHe2.
adamc@236	61 rewrite mult_plus_distr_l.
adamc@236	62 trivial.
adamc@236	63 Qed.
adamc@236	64
adamc@236	65 (** We use spaces to separate the two inductive cases. The second case mentions automatically-generated hypothesis names explicitly. As a result, innocuous changes to the theorem statement can invalidate the proof. *)
adamc@236	66
adamc@236	67 Reset eval_times.
adamc@236	68
adamc@236	69 Theorem eval_double : forall k x,
adamc@236	70 eval (times k x) = k * eval x.
adamc@236	71 induction x.
adamc@236	72
adamc@236	73 trivial.
adamc@236	74
adamc@236	75 simpl.
adamc@236	76 (** [[
adamc@236	77 rewrite IHe1.
adamc@236	78
adamc@236	79 Error: The reference IHe1 was not found in the current environment.
adamc@236	80
adamc@236	81 ]]
adamc@236	82
adamc@236	83 The inductive hypotheses are named [IHx1] and [IHx2] now, not [IHe1] and [IHe2]. *)
adamc@236	84
adamc@236	85 Abort.
adamc@236	86
adamc@236	87 (** We might decide to use a more explicit invocation of [induction] to give explicit binders for all of the names that we will reference later in the proof. *)
adamc@236	88
adamc@236	89 Theorem eval_times : forall k e,
adamc@236	90 eval (times k e) = k * eval e.
adamc@236	91 induction e as [ \| ? IHe1 ? IHe2 ].
adamc@236	92
adamc@236	93 trivial.
adamc@236	94
adamc@236	95 simpl.
adamc@236	96 rewrite IHe1.
adamc@236	97 rewrite IHe2.
adamc@236	98 rewrite mult_plus_distr_l.
adamc@236	99 trivial.
adamc@236	100 Qed.
adamc@236	101
adamc@236	102 (** We pass [induction] an %\textit{%#<i>#intro pattern#</i>#%}%, using a [\|] character to separate out instructions for the different inductive cases. Within a case, we write [?] to ask Coq to generate a name automatically, and we write an explicit name to assign that name to the corresponding new variable. It is apparent that, to use intro patterns to avoid proof brittleness, one needs to keep track of the seemingly unimportant facts of the orders in which variables are introduced. Thus, the script keeps working if we replace [e] by [x], but it has become more cluttered. Arguably, neither proof is particularly easy to follow.
adamc@236	103
adamc@237	104 That category of complaint has to do with understanding proofs as static artifacts. As with programming in general, with serious projects, it tends to be much more important to be able to support evolution of proofs as specifications change. Unstructured proofs like the above examples can be very hard to update in concert with theorem statements. For instance, consider how the last proof script plays out when we modify [times] to introduce a bug. *)
adamc@236	105
adamc@236	106 Reset times.
adamc@236	107
adamc@236	108 Fixpoint times (k : nat) (e : exp) : exp :=
adamc@236	109 match e with
adamc@236	110 \| Const n => Const (1 + k * n)
adamc@236	111 \| Plus e1 e2 => Plus (times k e1) (times k e2)
adamc@236	112 end.
adamc@236	113
adamc@236	114 Theorem eval_times : forall k e,
adamc@236	115 eval (times k e) = k * eval e.
adamc@236	116 induction e as [ \| ? IHe1 ? IHe2 ].
adamc@236	117
adamc@236	118 trivial.
adamc@236	119
adamc@236	120 simpl.
adamc@236	121 (** [[
adamc@236	122 rewrite IHe1.
adamc@236	123
adamc@236	124 Error: The reference IHe1 was not found in the current environment.
adamc@236	125
adamc@236	126 ]] *)
adamc@236	127
adamc@236	128 Abort.
adamc@236	129
adamc@237	130 (** Can you spot what went wrong, without stepping through the script step-by-step? The problem is that [trivial] never fails. Originally, [trivial] had been succeeding in proving an equality that follows by reflexivity. Our change to [times] leads to a case where that equality is no longer true. [trivial] happily leaves the false equality in place, and we continue on to the span of tactics intended for the second inductive case. Unfortunately, those tactics end up being applied to the %\textit{%#<i>#first#</i>#%}% case instead.
adamc@237	131
adamc@237	132 The problem with [trivial] could be "solved" by writing [solve [trivial]] instead, so that an error is signaled early on if something unexpected happens. However, the root problem is that the syntax of a tactic invocation does not imply how many subgoals it produces. Much more confusing instances of this problem are possible. For example, if a lemma [L] is modified to take an extra hypothesis, then uses of [apply L] will general more subgoals than before. Old unstructured proof scripts will become hopelessly jumbled, with tactics applied to inappropriate subgoals. Because of the lack of structure, there is usually relatively little to be gleaned from knowledge of the precise point in a proof script where an error is raised. *)
adamc@236	133
adamc@236	134 Reset times.
adamc@236	135
adamc@236	136 Fixpoint times (k : nat) (e : exp) : exp :=
adamc@236	137 match e with
adamc@236	138 \| Const n => Const (k * n)
adamc@236	139 \| Plus e1 e2 => Plus (times k e1) (times k e2)
adamc@236	140 end.
adamc@236	141
adamc@237	142 (** Many real developments try to make essentially unstructured proofs look structured by applying careful indentation conventions, idempotent case-marker tactics included soley to serve as documentation, and so on. All of these strategies suffer from the same kind of failure of abstraction that was just demonstrated. I like to say that if you find yourself caring about indentation in a proof script, it is a sign that the script is structured poorly.
adamc@236	143
adamc@236	144 We can rewrite the current proof with a single tactic. *)
adamc@236	145
adamc@236	146 Theorem eval_times : forall k e,
adamc@236	147 eval (times k e) = k * eval e.
adamc@236	148 induction e as [ \| ? IHe1 ? IHe2 ]; [
adamc@236	149 trivial
adamc@236	150 \| simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial ].
adamc@236	151 Qed.
adamc@236	152
adamc@236	153 (** This is an improvement in robustness of the script. We no longer need to worry about tactics from one case being applied to a different case. Still, the proof script is not especially readable. Probably most readers would not find it helpful in explaining why the theorem is true.
adamc@236	154
adamc@236	155 The situation gets worse in considering extensions to the theorem we want to prove. Let us add multiplication nodes to our [exp] type and see how the proof fares. *)
adamc@236	156
adamc@236	157 Reset exp.
adamc@236	158
adamc@236	159 Inductive exp : Set :=
adamc@236	160 \| Const : nat -> exp
adamc@236	161 \| Plus : exp -> exp -> exp
adamc@236	162 \| Mult : exp -> exp -> exp.
adamc@236	163
adamc@236	164 Fixpoint eval (e : exp) : nat :=
adamc@236	165 match e with
adamc@236	166 \| Const n => n
adamc@236	167 \| Plus e1 e2 => eval e1 + eval e2
adamc@236	168 \| Mult e1 e2 => eval e1 * eval e2
adamc@236	169 end.
adamc@236	170
adamc@236	171 Fixpoint times (k : nat) (e : exp) : exp :=
adamc@236	172 match e with
adamc@236	173 \| Const n => Const (k * n)
adamc@236	174 \| Plus e1 e2 => Plus (times k e1) (times k e2)
adamc@236	175 \| Mult e1 e2 => Mult (times k e1) e2
adamc@236	176 end.
adamc@236	177
adamc@236	178 Theorem eval_times : forall k e,
adamc@236	179 eval (times k e) = k * eval e.
adamc@236	180 (** [[
adamc@236	181 induction e as [ \| ? IHe1 ? IHe2 ]; [
adamc@236	182 trivial
adamc@236	183 \| simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial ].
adamc@236	184
adamc@236	185 Error: Expects a disjunctive pattern with 3 branches.
adamc@236	186
adamc@236	187 ]] *)
adamc@236	188
adamc@236	189 Abort.
adamc@236	190
adamc@236	191 (** Unsurprisingly, the old proof fails, because it explicitly says that there are two inductive cases. To update the script, we must, at a minimum, remember the order in which the inductive cases are generated, so that we can insert the new case in the appropriate place. Even then, it will be painful to add the case, because we cannot walk through proof steps interactively when they occur inside an explicit set of cases. *)
adamc@236	192
adamc@236	193 Theorem eval_times : forall k e,
adamc@236	194 eval (times k e) = k * eval e.
adamc@236	195 induction e as [ \| ? IHe1 ? IHe2 \| ? IHe1 ? IHe2 ]; [
adamc@236	196 trivial
adamc@236	197 \| simpl; rewrite IHe1; rewrite IHe2; rewrite mult_plus_distr_l; trivial
adamc@236	198 \| simpl; rewrite IHe1; rewrite mult_assoc; trivial ].
adamc@236	199 Qed.
adamc@236	200
adamc@236	201 (** Now we are in a position to see how much nicer is the style of proof that we have followed in most of this book. *)
adamc@236	202
adamc@236	203 Reset eval_times.
adamc@236	204
adamc@238	205 Hint Rewrite mult_plus_distr_l : cpdt.
adamc@238	206
adamc@236	207 Theorem eval_times : forall k e,
adamc@236	208 eval (times k e) = k * eval e.
adamc@236	209 induction e; crush.
adamc@236	210 Qed.
adamc@236	211
adamc@237	212 (** This style is motivated by a hard truth: one person's manual proof script is almost always mostly inscrutable to most everyone else. I claim that step-by-step formal proofs are a poor way of conveying information. Thus, we had might as well cut out the steps and automate as much as possible.
adamc@237	213
adamc@237	214 What about the illustrative value of proofs? Most informal proofs are read to convey the big ideas of proofs. How can reading [induction e; crush] convey any big ideas? My position is that any ideas that standard automation can find are not very big after all, and the %\textit{%#<i>#real#</i>#%}% big ideas should be expressed through lemmas that are added as hints.
adamc@237	215
adamc@237	216 An example should help illustrate what I mean. Consider this function, which rewrites an expression using associativity of addition and multiplication. *)
adamc@237	217
adamc@237	218 Fixpoint reassoc (e : exp) : exp :=
adamc@237	219 match e with
adamc@237	220 \| Const _ => e
adamc@237	221 \| Plus e1 e2 =>
adamc@237	222 let e1' := reassoc e1 in
adamc@237	223 let e2' := reassoc e2 in
adamc@237	224 match e2' with
adamc@237	225 \| Plus e21 e22 => Plus (Plus e1' e21) e22
adamc@237	226 \| _ => Plus e1' e2'
adamc@237	227 end
adamc@237	228 \| Mult e1 e2 =>
adamc@237	229 let e1' := reassoc e1 in
adamc@237	230 let e2' := reassoc e2 in
adamc@237	231 match e2' with
adamc@237	232 \| Mult e21 e22 => Mult (Mult e1' e21) e22
adamc@237	233 \| _ => Mult e1' e2'
adamc@237	234 end
adamc@237	235 end.
adamc@237	236
adamc@237	237 Theorem reassoc_correct : forall e, eval (reassoc e) = eval e.
adamc@237	238 induction e; crush;
adamc@237	239 match goal with
adamc@237	240 \| [ \|- context[match ?E with Const _ => _ \| Plus _ _ => _ \| Mult _ _ => _ end] ] =>
adamc@237	241 destruct E; crush
adamc@237	242 end.
adamc@237	243
adamc@237	244 (** One subgoal remains:
adamc@237	245 [[
adamc@237	246 IHe2 : eval e3 * eval e4 = eval e2
adamc@237	247 ============================
adamc@237	248 eval e1 * eval e3 * eval e4 = eval e1 * eval e2
adamc@237	249
adamc@237	250 ]]
adamc@237	251
adamc@237	252 [crush] does not know how to finish this goal. We could finish the proof manually. *)
adamc@237	253
adamc@237	254 rewrite <- IHe2; crush.
adamc@237	255
adamc@237	256 (** However, the proof would be easier to understand and maintain if we separated this insight into a separate lemma. *)
adamc@237	257
adamc@237	258 Abort.
adamc@237	259
adamc@237	260 Lemma rewr : forall a b c d, b * c = d -> a * b * c = a * d.
adamc@237	261 crush.
adamc@237	262 Qed.
adamc@237	263
adamc@237	264 Hint Resolve rewr.
adamc@237	265
adamc@237	266 Theorem reassoc_correct : forall e, eval (reassoc e) = eval e.
adamc@237	267 induction e; crush;
adamc@237	268 match goal with
adamc@237	269 \| [ \|- context[match ?E with Const _ => _ \| Plus _ _ => _ \| Mult _ _ => _ end] ] =>
adamc@237	270 destruct E; crush
adamc@237	271 end.
adamc@237	272 Qed.
adamc@237	273
adamc@237	274 (** In the limit, a complicated inductive proof might rely on one hint for each inductive case. The lemma for each hint could restate the associated case. Compared to manual proof scripts, we arrive at more readable results. Scripts no longer need to depend on the order in which cases are generated. The lemmas are easier to digest separately than are fragments of tactic code, since lemma statements include complete proof contexts. Such contexts can only be extracted from monolithic manual proofs by stepping through scripts interactively.
adamc@237	275
adamc@237	276 The more common situation is that a large induction has several easy cases that automation makes short work of. In the remaining cases, automation performs some standard simplification. Among these cases, some may require quite involved proofs; such a case may deserve a hint lemma of its own, where the lemma statement may copy the simplified version of the case. Alternatively, the proof script for the main theorem may be extended with some automation code targeted at the specific case. Even such targeted scripting is more desirable than manual proving, because it may be read and understood without knowledge of a proof's hierarchical structure, case ordering, or name binding structure. *)
adamc@237	277
adamc@235	278
adamc@238	279 (** * Debugging and Maintaining Automation *)
adamc@238	280
adamc@238	281 (** Fully-automated proofs are desirable because they open up possibilities for automatic adaptation to changes of specification. A well-engineered script within a narrow domain can survive many changes to the formulation of the problem it solves. Still, as we are working with higher-order logic, most theorems fall within no obvious decidable theories. It is inevitable that most long-lived automated proofs will need updating.
adamc@238	282
adamc@238	283 Before we are ready to update our proofs, we need to write them in the first place. While fully-automated scripts are most robust to changes of specification, it is hard to write every new proof directly in that form. Instead, it is useful to begin a theorem with exploratory proving and then gradually refine it into a suitable automated form.
adamc@238	284
adamc@238	285 Consider this theorem from Chapter 7, which we begin by proving in a mostly manual way, invoking [crush] after each steop to discharge any low-hanging fruit. Our manual effort involves choosing which expressions to case-analyze on. *)
adamc@238	286
adamc@238	287 (* begin hide *)
adamc@238	288 Require Import MoreDep.
adamc@238	289 (* end hide *)
adamc@238	290
adamc@238	291 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
adamc@238	292 induction e; crush.
adamc@238	293
adamc@238	294 dep_destruct (cfold e1); crush.
adamc@238	295 dep_destruct (cfold e2); crush.
adamc@238	296
adamc@238	297 dep_destruct (cfold e1); crush.
adamc@238	298 dep_destruct (cfold e2); crush.
adamc@238	299
adamc@238	300 dep_destruct (cfold e1); crush.
adamc@238	301 dep_destruct (cfold e2); crush.
adamc@238	302
adamc@238	303 dep_destruct (cfold e1); crush.
adamc@238	304 dep_destruct (expDenote e1); crush.
adamc@238	305
adamc@238	306 dep_destruct (cfold e); crush.
adamc@238	307
adamc@238	308 dep_destruct (cfold e); crush.
adamc@238	309 Qed.
adamc@238	310
adamc@238	311 (** In this complete proof, it is hard to avoid noticing a pattern. We rework the proof, abstracting over the patterns we find. *)
adamc@238	312
adamc@238	313 Reset cfold_correct.
adamc@238	314
adamc@238	315 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
adamc@238	316 induction e; crush.
adamc@238	317
adamc@238	318 (** The expression we want to destruct here turns out to be the discriminee of a [match], and we can easily enough write a tactic that destructs all such expressions. *)
adamc@238	319
adamc@238	320 Ltac t :=
adamc@238	321 repeat (match goal with
adamc@238	322 \| [ \|- context[match ?E with NConst _ => _ \| Plus _ _ => _
adamc@238	323 \| Eq _ _ => _ \| BConst _ => _ \| And _ _ => _
adamc@238	324 \| If _ _ _ _ => _ \| Pair _ _ _ _ => _
adamc@238	325 \| Fst _ _ _ => _ \| Snd _ _ _ => _ end] ] =>
adamc@238	326 dep_destruct E
adamc@238	327 end; crush).
adamc@238	328
adamc@238	329 t.
adamc@238	330
adamc@238	331 (** This tactic invocation discharges the whole case. It does the same on the next two cases, but it gets stuck on the fourth case. *)
adamc@238	332
adamc@238	333 t.
adamc@238	334
adamc@238	335 t.
adamc@238	336
adamc@238	337 t.
adamc@238	338
adamc@238	339 (** The subgoal's conclusion is:
adamc@238	340 [[
adamc@238	341 ============================
adamc@238	342 (if expDenote e1 then expDenote (cfold e2) else expDenote (cfold e3)) =
adamc@238	343 expDenote (if expDenote e1 then cfold e2 else cfold e3)
adamc@238	344
adamc@238	345 ]]
adamc@238	346
adamc@238	347 We need to expand our [t] tactic to handle this case. *)
adamc@238	348
adamc@238	349 Ltac t' :=
adamc@238	350 repeat (match goal with
adamc@238	351 \| [ \|- context[match ?E with NConst _ => _ \| Plus _ _ => _
adamc@238	352 \| Eq _ _ => _ \| BConst _ => _ \| And _ _ => _
adamc@238	353 \| If _ _ _ _ => _ \| Pair _ _ _ _ => _
adamc@238	354 \| Fst _ _ _ => _ \| Snd _ _ _ => _ end] ] =>
adamc@238	355 dep_destruct E
adamc@238	356 \| [ \|- (if ?E then _ else _) = _ ] => destruct E
adamc@238	357 end; crush).
adamc@238	358
adamc@238	359 t'.
adamc@238	360
adamc@238	361 (** Now the goal is discharged, but [t'] has no effect on the next subgoal. *)
adamc@238	362
adamc@238	363 t'.
adamc@238	364
adamc@238	365 (** A final revision of [t] finishes the proof. *)
adamc@238	366
adamc@238	367 Ltac t'' :=
adamc@238	368 repeat (match goal with
adamc@238	369 \| [ \|- context[match ?E with NConst _ => _ \| Plus _ _ => _
adamc@238	370 \| Eq _ _ => _ \| BConst _ => _ \| And _ _ => _
adamc@238	371 \| If _ _ _ _ => _ \| Pair _ _ _ _ => _
adamc@238	372 \| Fst _ _ _ => _ \| Snd _ _ _ => _ end] ] =>
adamc@238	373 dep_destruct E
adamc@238	374 \| [ \|- (if ?E then _ else _) = _ ] => destruct E
adamc@238	375 \| [ \|- context[match pairOut ?E with Some _ => _
adamc@238	376 \| None => _ end] ] =>
adamc@238	377 dep_destruct E
adamc@238	378 end; crush).
adamc@238	379
adamc@238	380 t''.
adamc@238	381
adamc@238	382 t''.
adamc@238	383 Qed.
adamc@238	384
adamc@238	385 (** We can take the final tactic and move it into the initial part of the proof script, arriving at a nicely-automated proof. *)
adamc@238	386
adamc@238	387 Reset t.
adamc@238	388
adamc@238	389 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
adamc@238	390 induction e; crush;
adamc@238	391 repeat (match goal with
adamc@238	392 \| [ \|- context[match ?E with NConst _ => _ \| Plus _ _ => _
adamc@238	393 \| Eq _ _ => _ \| BConst _ => _ \| And _ _ => _
adamc@238	394 \| If _ _ _ _ => _ \| Pair _ _ _ _ => _
adamc@238	395 \| Fst _ _ _ => _ \| Snd _ _ _ => _ end] ] =>
adamc@238	396 dep_destruct E
adamc@238	397 \| [ \|- (if ?E then _ else _) = _ ] => destruct E
adamc@238	398 \| [ \|- context[match pairOut ?E with Some _ => _
adamc@238	399 \| None => _ end] ] =>
adamc@238	400 dep_destruct E
adamc@238	401 end; crush).
adamc@238	402 Qed.
adamc@238	403
adamc@240	404 (** Even after we put together nice automated proofs, we must deal with specification changes that can invalidate them. It is not generally possible to step through single-tactic proofs interactively. There is a command [Debug On] that lets us step through points in tactic execution, but the debugger tends to make counterintuitive choices of which points we would like to stop at, and per-point output is quite verbose, so most Coq users do not find this debugging mode very helpful. How are we to understand what has broken in a script that used to work?
adamc@240	405
adamc@240	406 An example helps demonstrate a useful approach. Consider what would have happened in our proof of [reassoc_correct] if we had first added an unfortunate rewriting hint. *)
adamc@240	407
adamc@240	408 Reset reassoc_correct.
adamc@240	409
adamc@240	410 Theorem confounder : forall e1 e2 e3,
adamc@240	411 eval e1 * eval e2 * eval e3 = eval e1 * (eval e2 + 1 - 1) * eval e3.
adamc@240	412 crush.
adamc@240	413 Qed.
adamc@240	414
adamc@240	415 Hint Rewrite confounder : cpdt.
adamc@240	416
adamc@240	417 Theorem reassoc_correct : forall e, eval (reassoc e) = eval e.
adamc@240	418 induction e; crush;
adamc@240	419 match goal with
adamc@240	420 \| [ \|- context[match ?E with Const _ => _ \| Plus _ _ => _ \| Mult _ _ => _ end] ] =>
adamc@240	421 destruct E; crush
adamc@240	422 end.
adamc@240	423
adamc@240	424 (** One subgoal remains:
adamc@240	425
adamc@240	426 [[
adamc@240	427 ============================
adamc@240	428 eval e1 * (eval e3 + 1 - 1) * eval e4 = eval e1 * eval e2
adamc@240	429
adamc@240	430 ]]
adamc@240	431
adamc@240	432 The poorly-chosen rewrite rule fired, changing the goal to a form where another hint no longer applies. Imagine that we are in the middle of a large development with many hints. How would we diagnose the problem? First, we might not be sure which case of the inductive proof has gone wrong. It is useful to separate out our automation procedure and apply it manually. *)
adamc@240	433
adamc@240	434 Restart.
adamc@240	435
adamc@240	436 Ltac t := crush; match goal with
adamc@240	437 \| [ \|- context[match ?E with Const _ => _ \| Plus _ _ => _
adamc@240	438 \| Mult _ _ => _ end] ] =>
adamc@240	439 destruct E; crush
adamc@240	440 end.
adamc@240	441
adamc@240	442 induction e.
adamc@240	443
adamc@240	444 (** Since we see the subgoals before any simplification occurs, it is clear that this is the case for constants. [t] makes short work of it. *)
adamc@240	445
adamc@240	446 t.
adamc@240	447
adamc@240	448 (** The next subgoal, for addition, is also discharged without trouble. *)
adamc@240	449
adamc@240	450 t.
adamc@240	451
adamc@240	452 (** The final subgoal is for multiplication, and it is here that we get stuck in the proof state summarized above. *)
adamc@240	453
adamc@240	454 t.
adamc@240	455
adamc@240	456 (** What is [t] doing to get us to this point? The [info] command can help us answer this kind of question. *)
adamc@240	457
adamc@240	458 (** remove printing * *)
adamc@240	459 Undo.
adamc@240	460 info t.
adamc@240	461 (** [[
adamc@240	462 == simpl in ; intuition; subst; autorewrite with cpdt in ;
adamc@240	463 simpl in ; intuition; subst; autorewrite with cpdt in ;
adamc@240	464 simpl in *; intuition; subst; destruct (reassoc e2).
adamc@240	465 simpl in *; intuition.
adamc@240	466
adamc@240	467 simpl in *; intuition.
adamc@240	468
adamc@240	469 simpl in ; intuition; subst; autorewrite with cpdt in ;
adamc@240	470 refine (eq_ind_r
adamc@240	471 (fun n : nat =>
adamc@240	472 n * (eval e3 + 1 - 1) * eval e4 = eval e1 * eval e2) _ IHe1);
adamc@240	473 autorewrite with cpdt in ; simpl in ; intuition;
adamc@240	474 subst; autorewrite with cpdt in ; simpl in ;
adamc@240	475 intuition; subst.
adamc@240	476
adamc@240	477 ]]
adamc@240	478
adamc@240	479 A detailed trace of [t]'s execution appears. Since we are using the very general [crush] tactic, many of these steps have no effect and only occur as instances of a more general strategy. We can copy-and-paste the details to see where things go wrong. *)
adamc@240	480
adamc@240	481 Undo.
adamc@240	482
adamc@240	483 (** We arbitrarily split the script into chunks. The first few seem not to do any harm. *)
adamc@240	484
adamc@240	485 simpl in ; intuition; subst; autorewrite with cpdt in .
adamc@240	486 simpl in ; intuition; subst; autorewrite with cpdt in .
adamc@240	487 simpl in *; intuition; subst; destruct (reassoc e2).
adamc@240	488 simpl in *; intuition.
adamc@240	489 simpl in *; intuition.
adamc@240	490
adamc@240	491 (** The next step is revealed as the culprit, bringing us to the final unproved subgoal. *)
adamc@240	492
adamc@240	493 simpl in ; intuition; subst; autorewrite with cpdt in .
adamc@240	494
adamc@240	495 (** We can split the steps further to assign blame. *)
adamc@240	496
adamc@240	497 Undo.
adamc@240	498
adamc@240	499 simpl in *.
adamc@240	500 intuition.
adamc@240	501 subst.
adamc@240	502 autorewrite with cpdt in *.
adamc@240	503
adamc@240	504 (** It was the final of these four tactics that made the rewrite. We can find out exactly what happened. The [info] command presents hierarchical views of proof steps, and we can zoom down to a lower level of detail by applying [info] to one of the steps that appeared in the original trace. *)
adamc@240	505
adamc@240	506 Undo.
adamc@240	507
adamc@240	508 info autorewrite with cpdt in *.
adamc@240	509 (** [[
adamc@240	510 == refine (eq_ind_r (fun n : nat => n = eval e1 * eval e2) _
adamc@240	511 (confounder (reassoc e1) e3 e4)).
adamc@240	512
adamc@240	513 ]]
adamc@240	514
adamc@240	515 The way a rewrite is displayed is somewhat baroque, but we can see that theorem [confounder] is the final culprit. At this point, we could remove that hint, prove an alternate version of the key lemma [rewr], or come up with some other remedy. Fixing this kind of problem tends to be relatively easy once the problem is revealed. *)
adamc@240	516
adamc@240	517 Abort.
adamc@240	518
adamc@240	519 (** printing * $\times$ *)
adamc@240	520
adamc@241	521 (** Sometimes a change to a development has undesirable performance consequences, even if it does not prevent any old proof scripts from completing. If the performance consequences are severe enough, the proof scripts can be considered broken for practical purposes.
adamc@241	522
adamc@241	523 Here is one example of a performance surprise. *)
adamc@241	524
adamc@239	525 Section slow.
adamc@239	526 Hint Resolve trans_eq.
adamc@239	527
adamc@241	528 (** The central element of the problem is the addition of transitivity as a hint. With transitivity available, it is easy for proof search to wind up exploring exponential search spaces. We also add a few other arbitrary variables and hypotheses, designed to lead to trouble later. *)
adamc@241	529
adamc@239	530 Variable A : Set.
adamc@239	531 Variables P Q R S : A -> A -> Prop.
adamc@239	532 Variable f : A -> A.
adamc@239	533
adamc@239	534 Hypothesis H1 : forall x y, P x y -> Q x y -> R x y -> f x = f y.
adamc@239	535 Hypothesis H2 : forall x y, S x y -> R x y.
adamc@239	536
adamc@241	537 (** We prove a simple lemma very quickly, using the [Time] command to measure exactly how quickly. *)
adamc@241	538
adamc@239	539 Lemma slow : forall x y, P x y -> Q x y -> S x y -> f x = f y.
adamc@241	540 Time eauto 6.
adamc@241	541 (** [[
adamc@241	542 Finished transaction in 0. secs (0.068004u,0.s)
adamc@241	543 ]] *)
adamc@241	544
adamc@239	545 Qed.
adamc@239	546
adamc@241	547 (** Now we add a different hypothesis, which is innocent enough; in fact, it is even provable as a theorem. *)
adamc@241	548
adamc@239	549 Hypothesis H3 : forall x y, x = y -> f x = f y.
adamc@239	550
adamc@239	551 Lemma slow' : forall x y, P x y -> Q x y -> S x y -> f x = f y.
adamc@241	552 Time eauto 6.
adamc@241	553 (** [[
adamc@241	554 Finished transaction in 2. secs (1.264079u,0.s)
adamc@241	555
adamc@241	556 ]]
adamc@241	557
adamc@241	558 Why has the search time gone up so much? The [info] command is not much help, since it only shows the result of search, not all of the paths that turned out to be worthless. *)
adamc@241	559
adamc@241	560 Restart.
adamc@241	561 info eauto 6.
adamc@241	562 (** [[
adamc@241	563 == intro x; intro y; intro H; intro H0; intro H4;
adamc@241	564 simple eapply trans_eq.
adamc@241	565 simple apply refl_equal.
adamc@241	566
adamc@241	567 simple eapply trans_eq.
adamc@241	568 simple apply refl_equal.
adamc@241	569
adamc@241	570 simple eapply trans_eq.
adamc@241	571 simple apply refl_equal.
adamc@241	572
adamc@241	573 simple apply H1.
adamc@241	574 eexact H.
adamc@241	575
adamc@241	576 eexact H0.
adamc@241	577
adamc@241	578 simple apply H2; eexact H4.
adamc@241	579
adamc@241	580 ]]
adamc@241	581
adamc@241	582 This output does not tell us why proof search takes so long, but it does provide a clue that would be useful if we had forgotten that we added transitivity as a hint. The [eauto] tactic is applying depth-first search, and the proof script where the real action is ends up buried inside a chain of pointless invocations of transitivity, where each invocation uses reflexivity to discharge one subgoal. Each increment to the depth argument to [eauto] adds another silly use of transitivity. This wasted proof effort only adds linear time overhead, as long as proof search never makes false steps. No false steps were made before we added the new hypothesis, but somehow the addition made possible a new faulty path. To understand which paths we enabled, we can use the [debug] command. *)
adamc@241	583
adamc@241	584 Restart.
adamc@241	585 debug eauto 6.
adamc@241	586
adamc@241	587 (** The output is a large proof tree. The beginning of the tree is enough to reveal what is happening:
adamc@241	588
adamc@241	589 [[
adamc@241	590 1 depth=6
adamc@241	591 1.1 depth=6 intro
adamc@241	592 1.1.1 depth=6 intro
adamc@241	593 1.1.1.1 depth=6 intro
adamc@241	594 1.1.1.1.1 depth=6 intro
adamc@241	595 1.1.1.1.1.1 depth=6 intro
adamc@241	596 1.1.1.1.1.1.1 depth=5 apply H3
adamc@241	597 1.1.1.1.1.1.1.1 depth=4 eapply trans_eq
adamc@241	598 1.1.1.1.1.1.1.1.1 depth=4 apply refl_equal
adamc@241	599 1.1.1.1.1.1.1.1.1.1 depth=3 eapply trans_eq
adamc@241	600 1.1.1.1.1.1.1.1.1.1.1 depth=3 apply refl_equal
adamc@241	601 1.1.1.1.1.1.1.1.1.1.1.1 depth=2 eapply trans_eq
adamc@241	602 1.1.1.1.1.1.1.1.1.1.1.1.1 depth=2 apply refl_equal
adamc@241	603 1.1.1.1.1.1.1.1.1.1.1.1.1.1 depth=1 eapply trans_eq
adamc@241	604 1.1.1.1.1.1.1.1.1.1.1.1.1.1.1 depth=1 apply refl_equal
adamc@241	605 1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1 depth=0 eapply trans_eq
adamc@241	606 1.1.1.1.1.1.1.1.1.1.1.1.1.1.2 depth=1 apply sym_eq ; trivial
adamc@241	607 1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.1 depth=0 eapply trans_eq
adamc@241	608 1.1.1.1.1.1.1.1.1.1.1.1.1.1.3 depth=0 eapply trans_eq
adamc@241	609 1.1.1.1.1.1.1.1.1.1.1.1.2 depth=2 apply sym_eq ; trivial
adamc@241	610 1.1.1.1.1.1.1.1.1.1.1.1.2.1 depth=1 eapply trans_eq
adamc@241	611 1.1.1.1.1.1.1.1.1.1.1.1.2.1.1 depth=1 apply refl_equal
adamc@241	612 1.1.1.1.1.1.1.1.1.1.1.1.2.1.1.1 depth=0 eapply trans_eq
adamc@241	613 1.1.1.1.1.1.1.1.1.1.1.1.2.1.2 depth=1 apply sym_eq ; trivial
adamc@241	614 1.1.1.1.1.1.1.1.1.1.1.1.2.1.2.1 depth=0 eapply trans_eq
adamc@241	615 1.1.1.1.1.1.1.1.1.1.1.1.2.1.3 depth=0 eapply trans_eq
adamc@241	616
adamc@241	617 ]]
adamc@241	618
adamc@241	619 The first choice [eauto] makes is to apply [H3], since [H3] has the fewest hypotheses of all of the hypotheses and hints that match. However, it turns out that the single hypothesis generated is unprovable. That does not stop [eauto] from trying to prove it with an exponentially-sized tree of applications of transitivity, reflexivity, and symmetry of equality. It is the children of the initial [apply H3] that account for all of the noticeable time in proof execution. In a more realistic development, we might use this output of [info] to realize that adding transitivity as a hint was a bad idea. *)
adamc@241	620
adamc@239	621 Qed.
adamc@239	622 End slow.
adamc@239	623
adamc@241	624 (** It is also easy to end up with a proof script that uses too much memory. As tactics run, they avoid generating proof terms, since serious proof search will consider many possible avenues, and we do not want to built proof terms for subproofs that end up unused. Instead, tactic execution maintains %\textit{%#<i>#thunks#</i>#%}% (suspended computations, represented with closures), such that a tactic's proof-producing thunk is only executed when we run [Qed]. These thunks can use up large amounts of space, such that a proof script exhausts available memory, even when we know that we could have used much less memory by forcing some thunks earlier.
adamc@241	625
adamc@241	626 The [abstract] tactical helps us force thunks by proving some subgoals as their own lemmas. For instance, a proof [induction x; crush] can in many cases be made to use significantly less peak memory by changing it to [induction x; abstract crush]. The main limitation of [abstract] is that it can only be applied to subgoals that are proved completely, with no undetermined unification variables remaining. Still, many large automated proofs can realize vast memory savings via [abstract]. *)
adamc@241	627
adamc@238	628
adamc@235	629 (** * Modules *)
adamc@235	630
adamc@235	631 Module Type GROUP.
adamc@235	632 Parameter G : Set.
adamc@235	633 Parameter f : G -> G -> G.
adamc@235	634 Parameter e : G.
adamc@235	635 Parameter i : G -> G.
adamc@235	636
adamc@235	637 Axiom assoc : forall a b c, f (f a b) c = f a (f b c).
adamc@235	638 Axiom ident : forall a, f e a = a.
adamc@235	639 Axiom inverse : forall a, f (i a) a = e.
adamc@235	640 End GROUP.
adamc@235	641
adamc@235	642 Module Type GROUP_THEOREMS.
adamc@235	643 Declare Module M : GROUP.
adamc@235	644
adamc@235	645 Axiom ident' : forall a, M.f a M.e = a.
adamc@235	646
adamc@235	647 Axiom inverse' : forall a, M.f a (M.i a) = M.e.
adamc@235	648
adamc@235	649 Axiom unique_ident : forall e', (forall a, M.f e' a = a) -> e' = M.e.
adamc@235	650 End GROUP_THEOREMS.
adamc@235	651
adamc@239	652 Module Group (M : GROUP) : GROUP_THEOREMS with Module M := M.
adamc@235	653 Module M := M.
adamc@235	654
adamc@235	655 Import M.
adamc@235	656
adamc@235	657 Theorem inverse' : forall a, f a (i a) = e.
adamc@235	658 intro.
adamc@235	659 rewrite <- (ident (f a (i a))).
adamc@235	660 rewrite <- (inverse (f a (i a))) at 1.
adamc@235	661 rewrite assoc.
adamc@235	662 rewrite assoc.
adamc@235	663 rewrite <- (assoc (i a) a (i a)).
adamc@235	664 rewrite inverse.
adamc@235	665 rewrite ident.
adamc@235	666 apply inverse.
adamc@235	667 Qed.
adamc@235	668
adamc@235	669 Theorem ident' : forall a, f a e = a.
adamc@235	670 intro.
adamc@235	671 rewrite <- (inverse a).
adamc@235	672 rewrite <- assoc.
adamc@235	673 rewrite inverse'.
adamc@235	674 apply ident.
adamc@235	675 Qed.
adamc@235	676
adamc@235	677 Theorem unique_ident : forall e', (forall a, M.f e' a = a) -> e' = M.e.
adamc@235	678 intros.
adamc@235	679 rewrite <- (H e).
adamc@235	680 symmetry.
adamc@235	681 apply ident'.
adamc@235	682 Qed.
adamc@235	683 End Group.
adamc@239	684
adamc@239	685 Require Import ZArith.
adamc@239	686 Open Scope Z_scope.
adamc@239	687
adamc@239	688 Module Int.
adamc@239	689 Definition G := Z.
adamc@239	690 Definition f x y := x + y.
adamc@239	691 Definition e := 0.
adamc@239	692 Definition i x := -x.
adamc@239	693
adamc@239	694 Theorem assoc : forall a b c, f (f a b) c = f a (f b c).
adamc@239	695 unfold f; crush.
adamc@239	696 Qed.
adamc@239	697 Theorem ident : forall a, f e a = a.
adamc@239	698 unfold f, e; crush.
adamc@239	699 Qed.
adamc@239	700 Theorem inverse : forall a, f (i a) a = e.
adamc@239	701 unfold f, i, e; crush.
adamc@239	702 Qed.
adamc@239	703 End Int.
adamc@239	704
adamc@239	705 Module IntTheorems := Group(Int).
adamc@239	706
adamc@239	707 Check IntTheorems.unique_ident.
adamc@239	708
adamc@239	709 Theorem unique_ident : forall e', (forall a, e' + a = a) -> e' = 0.
adamc@239	710 exact IntTheorems.unique_ident.
adamc@239	711 Qed.

Mercurial > cpdt > repo

annotate src/Large.v @ 241:cb3f3ef9d5bb