annotate src/MoreDep.v @ 100:070f4de92311

Templatize MoreDep
author Adam Chlipala <adamc@hcoop.net>
date Wed, 08 Oct 2008 10:01:26 -0400
parents 696726af9530
children bc12662ae895
rev   line source
adamc@83 1 (* Copyright (c) 2008, Adam Chlipala
adamc@83 2 *
adamc@83 3 * This work is licensed under a
adamc@83 4 * Creative Commons Attribution-Noncommercial-No Derivative Works 3.0
adamc@83 5 * Unported License.
adamc@83 6 * The license text is available at:
adamc@83 7 * http://creativecommons.org/licenses/by-nc-nd/3.0/
adamc@83 8 *)
adamc@83 9
adamc@83 10 (* begin hide *)
adamc@85 11 Require Import Arith Bool List.
adamc@83 12
adamc@86 13 Require Import Tactics MoreSpecif.
adamc@83 14
adamc@83 15 Set Implicit Arguments.
adamc@83 16 (* end hide *)
adamc@83 17
adamc@83 18
adamc@83 19 (** %\chapter{More Dependent Types}% *)
adamc@83 20
adamc@83 21 (** Subset types and their relatives help us integrate verification with programming. Though they reorganize the certified programmer's workflow, they tend not to have deep effects on proofs. We write largely the same proofs as we would for classical verification, with some of the structure moved into the programs themselves. It turns out that, when we use dependent types to their full potential, we warp the development and proving process even more than that, picking up "free theorems" to the extent that often a certified program is hardly more complex than its uncertified counterpart in Haskell or ML.
adamc@83 22
adamc@83 23 In particular, we have only scratched the tip of the iceberg that is Coq's inductive definition mechanism. The inductive types we have seen so far have their counterparts in the other proof assistants that we surveyed in Chapter 1. This chapter explores the strange new world of dependent inductive datatypes (that is, dependent inductive types outside [Prop]), a possibility which sets Coq apart from all of the competition not based on type theory. *)
adamc@83 24
adamc@84 25
adamc@84 26 (** * Length-Indexed Lists *)
adamc@84 27
adamc@84 28 (** Many introductions to dependent types start out by showing how to use them to eliminate array bounds checks. When the type of an array tells you how many elements it has, your compiler can detect out-of-bounds dereferences statically. Since we are working in a pure functional language, the next best thing is length-indexed lists, which the following code defines. *)
adamc@84 29
adamc@84 30 Section ilist.
adamc@84 31 Variable A : Set.
adamc@84 32
adamc@84 33 Inductive ilist : nat -> Set :=
adamc@84 34 | Nil : ilist O
adamc@84 35 | Cons : forall n, A -> ilist n -> ilist (S n).
adamc@84 36
adamc@84 37 (** We see that, within its section, [ilist] is given type [nat -> Set]. Previously, every inductive type we have seen has either had plain [Set] as its type or has been a predicate with some type ending in [Prop]. The full generality of inductive definitions lets us integrate the expressivity of predicates directly into our normal programming.
adamc@84 38
adamc@84 39 The [nat] argument to [ilist] tells us the length of the list. The types of [ilist]'s constructors tell us that a [Nil] list has length [O] and that a [Cons] list has length one greater than the length of its sublist. We may apply [ilist] to any natural number, even natural numbers that are only known at runtime. It is this breaking of the %\textit{%#<i>#phase distinction#</i>#%}% that characterizes [ilist] as %\textit{%#<i>#dependently typed#</i>#%}%.
adamc@84 40
adamc@84 41 In expositions of list types, we usually see the length function defined first, but here that would not be a very productive function to code. Instead, let us implement list concatenation.
adamc@84 42
adamc@84 43 [[
adamc@84 44 Fixpoint app n1 (ls1 : ilist n1) n2 (ls2 : ilist n2) {struct ls1} : ilist (n1 + n2) :=
adamc@84 45 match ls1 with
adamc@84 46 | Nil => ls2
adamc@84 47 | Cons _ x ls1' => Cons x (app ls1' ls2)
adamc@84 48 end.
adamc@84 49
adamc@84 50 Coq is not happy with this definition:
adamc@84 51
adamc@84 52 [[
adamc@84 53 The term "ls2" has type "ilist n2" while it is expected to have type
adamc@84 54 "ilist (?14 + n2)"
adamc@84 55 ]]
adamc@84 56
adamc@84 57 We see the return of a problem we have considered before. Without explicit annotations, Coq does not enrich our typing assumptions in the branches of a [match] expression. It is clear that the unification variable [?14] should be resolved to 0 in this context, so that we have [0 + n2] reducing to [n2], but Coq does not realize that. We cannot fix the problem using just the simple [return] clauses we applied in the last chapter. We need to combine a [return] clause with a new kind of annotation, an [in] clause. *)
adamc@84 58
adamc@100 59 (* EX: Implement concatenation *)
adamc@100 60
adamc@100 61 (* begin thide *)
adamc@84 62 Fixpoint app n1 (ls1 : ilist n1) n2 (ls2 : ilist n2) {struct ls1} : ilist (n1 + n2) :=
adamc@84 63 match ls1 in (ilist n1) return (ilist (n1 + n2)) with
adamc@84 64 | Nil => ls2
adamc@84 65 | Cons _ x ls1' => Cons x (app ls1' ls2)
adamc@84 66 end.
adamc@100 67 (* end thide *)
adamc@84 68
adamc@84 69 (** This version of [app] passes the type checker. Using [return] alone allowed us to express a dependency of the [match] result type on the %\textit{%#<i>#value#</i>#%}% of the discriminee. What [in] adds to our arsenal is a way of expressing a dependency on the %\textit{%#<i>#type#</i>#%}% of the discriminee. Specifically, the [n1] in the [in] clause above is a %\textit{%#<i>#binding occurrence#</i>#%}% whose scope is the [return] clause.
adamc@84 70
adamc@84 71 We may use [in] clauses only to bind names for the arguments of an inductive type family. That is, each [in] clause must be an inductive type family name applied to a sequence of underscores and variable names of the proper length. The positions for %\textit{%#<i>#parameters#</i>#%}% to the type family must all be underscores. Parameters are those arguments declared with section variables or with entries to the left of the first colon in an inductive definition. They cannot vary depending on which constructor was used to build the discriminee, so Coq prohibits pointless matches on them. It is those arguments defined in the type to the right of the colon that we may name with [in] clauses.
adamc@84 72
adamc@84 73 Our [app] function could be typed in so-called %\textit{%#<i>#stratified#</i>#%}% type systems, which avoid true dependency. We could consider the length indices to lists to live in a separate, compile-time-only universe from the lists themselves. Our next example would be harder to implement in a stratified system. We write an injection function from regular lists to length-indexed lists. A stratified implementation would need to duplicate the definition of lists across compile-time and run-time versions, and the run-time versions would need to be indexed by the compile-time versions. *)
adamc@84 74
adamc@100 75 (* EX: Implement injection from normal lists *)
adamc@100 76
adamc@100 77 (* begin thide *)
adamc@84 78 Fixpoint inject (ls : list A) : ilist (length ls) :=
adamc@84 79 match ls return (ilist (length ls)) with
adamc@84 80 | nil => Nil
adamc@84 81 | h :: t => Cons h (inject t)
adamc@84 82 end.
adamc@84 83
adamc@84 84 (** We can define an inverse conversion and prove that it really is an inverse. *)
adamc@84 85
adamc@84 86 Fixpoint unject n (ls : ilist n) {struct ls} : list A :=
adamc@84 87 match ls with
adamc@84 88 | Nil => nil
adamc@84 89 | Cons _ h t => h :: unject t
adamc@84 90 end.
adamc@84 91
adamc@84 92 Theorem inject_inverse : forall ls, unject (inject ls) = ls.
adamc@84 93 induction ls; crush.
adamc@84 94 Qed.
adamc@100 95 (* end thide *)
adamc@100 96
adamc@100 97 (* EX: Implement statically-checked "car"/"hd" *)
adamc@84 98
adamc@84 99 (** Now let us attempt a function that is surprisingly tricky to write. In ML, the list head function raises an exception when passed an empty list. With length-indexed lists, we can rule out such invalid calls statically, and here is a first attempt at doing so.
adamc@84 100
adamc@84 101 [[
adamc@84 102 Definition hd n (ls : ilist (S n)) : A :=
adamc@84 103 match ls with
adamc@84 104 | Nil => ???
adamc@84 105 | Cons _ h _ => h
adamc@84 106 end.
adamc@84 107
adamc@84 108 It is not clear what to write for the [Nil] case, so we are stuck before we even turn our function over to the type checker. We could try omitting the [Nil] case:
adamc@84 109
adamc@84 110 [[
adamc@84 111 Definition hd n (ls : ilist (S n)) : A :=
adamc@84 112 match ls with
adamc@84 113 | Cons _ h _ => h
adamc@84 114 end.
adamc@84 115
adamc@84 116 [[
adamc@84 117 Error: Non exhaustive pattern-matching: no clause found for pattern Nil
adamc@84 118 ]]
adamc@84 119
adamc@84 120 Unlike in ML, we cannot use inexhaustive pattern matching, becuase there is no conception of a %\texttt{%#<tt>#Match#</tt>#%}% exception to be thrown. We might try using an [in] clause somehow.
adamc@84 121
adamc@84 122 [[
adamc@84 123 Definition hd n (ls : ilist (S n)) : A :=
adamc@84 124 match ls in (ilist (S n)) with
adamc@84 125 | Cons _ h _ => h
adamc@84 126 end.
adamc@84 127
adamc@84 128 [[
adamc@84 129 Error: The reference n was not found in the current environment
adamc@84 130 ]]
adamc@84 131
adamc@84 132 In this and other cases, we feel like we want [in] clauses with type family arguments that are not variables. Unfortunately, Coq only supports variables in those positions. A completely general mechanism could only be supported with a solution to the problem of higher-order unification, which is undecidable. There %\textit{%#<i>#are#</i>#%}% useful heuristics for handling non-variable indices which are gradually making their way into Coq, but we will spend some time in this and the next few chapters on effective pattern matching on dependent types using only the primitive [match] annotations.
adamc@84 133
adamc@84 134 Our final, working attempt at [hd] uses an auxiliary function and a surprising [return] annotation. *)
adamc@84 135
adamc@100 136 (* begin thide *)
adamc@84 137 Definition hd' n (ls : ilist n) :=
adamc@84 138 match ls in (ilist n) return (match n with O => unit | S _ => A end) with
adamc@84 139 | Nil => tt
adamc@84 140 | Cons _ h _ => h
adamc@84 141 end.
adamc@84 142
adamc@84 143 Definition hd n (ls : ilist (S n)) : A := hd' ls.
adamc@100 144 (* end thide *)
adamc@84 145
adamc@84 146 (** We annotate our main [match] with a type that is itself a [match]. We write that the function [hd'] returns [unit] when the list is empty and returns the carried type [A] in all other cases. In the definition of [hd], we just call [hd']. Because the index of [ls] is known to be nonzero, the type checker reduces the [match] in the type of [hd'] to [A]. *)
adamc@84 147
adamc@84 148 End ilist.
adamc@85 149
adamc@85 150
adamc@85 151 (** * A Tagless Interpreter *)
adamc@85 152
adamc@85 153 (** A favorite example for motivating the power of functional programming is implementation of a simple expression language interpreter. In ML and Haskell, such interpreters are often implemented using an algebraic datatype of values, where at many points it is checked that a value was built with the right constructor of the value type. With dependent types, we can implement a %\textit{%#<i>#tagless#</i>#%}% interpreter that both removes this source of runtime ineffiency and gives us more confidence that our implementation is correct. *)
adamc@85 154
adamc@85 155 Inductive type : Set :=
adamc@85 156 | Nat : type
adamc@85 157 | Bool : type
adamc@85 158 | Prod : type -> type -> type.
adamc@85 159
adamc@85 160 Inductive exp : type -> Set :=
adamc@85 161 | NConst : nat -> exp Nat
adamc@85 162 | Plus : exp Nat -> exp Nat -> exp Nat
adamc@85 163 | Eq : exp Nat -> exp Nat -> exp Bool
adamc@85 164
adamc@85 165 | BConst : bool -> exp Bool
adamc@85 166 | And : exp Bool -> exp Bool -> exp Bool
adamc@85 167 | If : forall t, exp Bool -> exp t -> exp t -> exp t
adamc@85 168
adamc@85 169 | Pair : forall t1 t2, exp t1 -> exp t2 -> exp (Prod t1 t2)
adamc@85 170 | Fst : forall t1 t2, exp (Prod t1 t2) -> exp t1
adamc@85 171 | Snd : forall t1 t2, exp (Prod t1 t2) -> exp t2.
adamc@85 172
adamc@85 173 (** We have a standard algebraic datatype [type], defining a type language of naturals, booleans, and product (pair) types. Then we have the indexed inductive type [exp], where the argument to [exp] tells us the encoded type of an expression. In effect, we are defining the typing rules for expressions simultaneously with the syntax.
adamc@85 174
adamc@85 175 We can give types and expressions semantics in a new style, based critically on the chance for %\textit{%#<i>#type-level computation#</i>#%}%. *)
adamc@85 176
adamc@85 177 Fixpoint typeDenote (t : type) : Set :=
adamc@85 178 match t with
adamc@85 179 | Nat => nat
adamc@85 180 | Bool => bool
adamc@85 181 | Prod t1 t2 => typeDenote t1 * typeDenote t2
adamc@85 182 end%type.
adamc@85 183
adamc@85 184 (** [typeDenote] compiles types of our object language into "native" Coq types. It is deceptively easy to implement. The only new thing we see is the [%type] annotation, which tells Coq to parse the [match] expression using the notations associated with types. Without this annotation, the [*] would be interpreted as multiplication on naturals, rather than as the product type constructor. [type] is one example of an identifer bound to a %\textit{%#<i>#notation scope#</i>#%}%. We will deal more explicitly with notations and notation scopes in later chapters.
adamc@85 185
adamc@85 186 We can define a function [expDenote] that is typed in terms of [typeDenote]. *)
adamc@85 187
adamc@85 188 Fixpoint expDenote t (e : exp t) {struct e} : typeDenote t :=
adamc@85 189 match e in (exp t) return (typeDenote t) with
adamc@85 190 | NConst n => n
adamc@85 191 | Plus e1 e2 => expDenote e1 + expDenote e2
adamc@85 192 | Eq e1 e2 => if eq_nat_dec (expDenote e1) (expDenote e2) then true else false
adamc@85 193
adamc@85 194 | BConst b => b
adamc@85 195 | And e1 e2 => expDenote e1 && expDenote e2
adamc@85 196 | If _ e' e1 e2 => if expDenote e' then expDenote e1 else expDenote e2
adamc@85 197
adamc@85 198 | Pair _ _ e1 e2 => (expDenote e1, expDenote e2)
adamc@85 199 | Fst _ _ e' => fst (expDenote e')
adamc@85 200 | Snd _ _ e' => snd (expDenote e')
adamc@85 201 end.
adamc@85 202
adamc@85 203 (** Again we find that an [in] annotation is essential for type-checking a function. Besides that, the definition is routine. In fact, it is less complicated than what we would write in ML or Haskell 98, since we do not need to worry about pushing final values in and out of an algebraic datatype. The only unusual thing is the use of an expression of the form [if E then true else false] in the [Eq] case. Remember that [eq_nat_dec] has a rich dependent type, rather than a simple boolean type. Coq's native [if] is overloaded to work on a test of any two-constructor type, so we can use [if] to build a simple boolean from the [sumbool] that [eq_nat_dec] returns.
adamc@85 204
adamc@85 205 We can implement our old favorite, a constant folding function, and prove it correct. It will be useful to write a function [pairOut] that checks if an [exp] of [Prod] type is a pair, returning its two components if so. Unsurprisingly, a first attempt leads to a type error.
adamc@85 206
adamc@85 207 [[
adamc@85 208 Definition pairOut t1 t2 (e : exp (Prod t1 t2)) : option (exp t1 * exp t2) :=
adamc@85 209 match e in (exp (Prod t1 t2)) return option (exp t1 * exp t2) with
adamc@85 210 | Pair _ _ e1 e2 => Some (e1, e2)
adamc@85 211 | _ => None
adamc@85 212 end.
adamc@85 213
adamc@85 214 [[
adamc@85 215 Error: The reference t2 was not found in the current environment
adamc@85 216 ]]
adamc@85 217
adamc@85 218 We run again into the problem of not being able to specify non-variable arguments in [in] clauses. The problem would just be hopeless without a use of an [in] clause, though, since the result type of the [match] depends on an argument to [exp]. Our solution will be to use a more general type, as we did for [hd]. First, we define a type-valued function to use in assigning a type to [pairOut]. *)
adamc@85 219
adamc@100 220 (* EX: Define a function [pairOut : forall t1 t2, exp (Prod t1 t2) -> option (exp t1 * exp t2)] *)
adamc@100 221
adamc@100 222 (* begin thide *)
adamc@85 223 Definition pairOutType (t : type) :=
adamc@85 224 match t with
adamc@85 225 | Prod t1 t2 => option (exp t1 * exp t2)
adamc@85 226 | _ => unit
adamc@85 227 end.
adamc@85 228
adamc@85 229 (** When passed a type that is a product, [pairOutType] returns our final desired type. On any other input type, [pairOutType] returns [unit], since we do not care about extracting components of non-pairs. Now we can write another helper function to provide the default behavior of [pairOut], which we will apply for inputs that are not literal pairs. *)
adamc@85 230
adamc@85 231 Definition pairOutDefault (t : type) :=
adamc@85 232 match t return (pairOutType t) with
adamc@85 233 | Prod _ _ => None
adamc@85 234 | _ => tt
adamc@85 235 end.
adamc@85 236
adamc@85 237 (** Now [pairOut] is deceptively easy to write. *)
adamc@85 238
adamc@85 239 Definition pairOut t (e : exp t) :=
adamc@85 240 match e in (exp t) return (pairOutType t) with
adamc@85 241 | Pair _ _ e1 e2 => Some (e1, e2)
adamc@85 242 | _ => pairOutDefault _
adamc@85 243 end.
adamc@100 244 (* end thide *)
adamc@85 245
adamc@85 246 (** There is one important subtlety in this definition. Coq allows us to use convenient ML-style pattern matching notation, but, internally and in proofs, we see that patterns are expanded out completely, matching one level of inductive structure at a time. Thus, the default case in the [match] above expands out to one case for each constructor of [exp] besides [Pair], and the underscore in [pairOutDefault _] is resolved differently in each case. From an ML or Haskell programmer's perspective, what we have here is type inference determining which code is run (returning either [None] or [tt]), which goes beyond what is possible with type inference guiding parametric polymorphism in Hindley-Milner languages, but is similar to what goes on with Haskell type classes.
adamc@85 247
adamc@85 248 With [pairOut] available, we can write [cfold] in a straightforward way. There are really no surprises beyond that Coq verifies that this code has such an expressive type, given the small annotation burden. *)
adamc@85 249
adamc@85 250 Fixpoint cfold t (e : exp t) {struct e} : exp t :=
adamc@85 251 match e in (exp t) return (exp t) with
adamc@85 252 | NConst n => NConst n
adamc@85 253 | Plus e1 e2 =>
adamc@85 254 let e1' := cfold e1 in
adamc@85 255 let e2' := cfold e2 in
adamc@85 256 match e1', e2' with
adamc@85 257 | NConst n1, NConst n2 => NConst (n1 + n2)
adamc@85 258 | _, _ => Plus e1' e2'
adamc@85 259 end
adamc@85 260 | Eq e1 e2 =>
adamc@85 261 let e1' := cfold e1 in
adamc@85 262 let e2' := cfold e2 in
adamc@85 263 match e1', e2' with
adamc@85 264 | NConst n1, NConst n2 => BConst (if eq_nat_dec n1 n2 then true else false)
adamc@85 265 | _, _ => Eq e1' e2'
adamc@85 266 end
adamc@85 267
adamc@85 268 | BConst b => BConst b
adamc@85 269 | And e1 e2 =>
adamc@85 270 let e1' := cfold e1 in
adamc@85 271 let e2' := cfold e2 in
adamc@85 272 match e1', e2' with
adamc@85 273 | BConst b1, BConst b2 => BConst (b1 && b2)
adamc@85 274 | _, _ => And e1' e2'
adamc@85 275 end
adamc@85 276 | If _ e e1 e2 =>
adamc@85 277 let e' := cfold e in
adamc@85 278 match e' with
adamc@85 279 | BConst true => cfold e1
adamc@85 280 | BConst false => cfold e2
adamc@85 281 | _ => If e' (cfold e1) (cfold e2)
adamc@85 282 end
adamc@85 283
adamc@85 284 | Pair _ _ e1 e2 => Pair (cfold e1) (cfold e2)
adamc@85 285 | Fst _ _ e =>
adamc@85 286 let e' := cfold e in
adamc@85 287 match pairOut e' with
adamc@85 288 | Some p => fst p
adamc@85 289 | None => Fst e'
adamc@85 290 end
adamc@85 291 | Snd _ _ e =>
adamc@85 292 let e' := cfold e in
adamc@85 293 match pairOut e' with
adamc@85 294 | Some p => snd p
adamc@85 295 | None => Snd e'
adamc@85 296 end
adamc@85 297 end.
adamc@85 298
adamc@85 299 (** The correctness theorem for [cfold] turns out to be easy to prove, once we get over one serious hurdle. *)
adamc@85 300
adamc@85 301 Theorem cfold_correct : forall t (e : exp t), expDenote e = expDenote (cfold e).
adamc@100 302 (* begin thide *)
adamc@85 303 induction e; crush.
adamc@85 304
adamc@85 305 (** The first remaining subgoal is:
adamc@85 306
adamc@85 307 [[
adamc@85 308
adamc@85 309 expDenote (cfold e1) + expDenote (cfold e2) =
adamc@85 310 expDenote
adamc@85 311 match cfold e1 with
adamc@85 312 | NConst n1 =>
adamc@85 313 match cfold e2 with
adamc@85 314 | NConst n2 => NConst (n1 + n2)
adamc@85 315 | Plus _ _ => Plus (cfold e1) (cfold e2)
adamc@85 316 | Eq _ _ => Plus (cfold e1) (cfold e2)
adamc@85 317 | BConst _ => Plus (cfold e1) (cfold e2)
adamc@85 318 | And _ _ => Plus (cfold e1) (cfold e2)
adamc@85 319 | If _ _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 320 | Pair _ _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 321 | Fst _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 322 | Snd _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 323 end
adamc@85 324 | Plus _ _ => Plus (cfold e1) (cfold e2)
adamc@85 325 | Eq _ _ => Plus (cfold e1) (cfold e2)
adamc@85 326 | BConst _ => Plus (cfold e1) (cfold e2)
adamc@85 327 | And _ _ => Plus (cfold e1) (cfold e2)
adamc@85 328 | If _ _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 329 | Pair _ _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 330 | Fst _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 331 | Snd _ _ _ => Plus (cfold e1) (cfold e2)
adamc@85 332 end
adamc@85 333 ]]
adamc@85 334
adamc@85 335 We would like to do a case analysis on [cfold e1], and we attempt that in the way that has worked so far.
adamc@85 336
adamc@85 337 [[
adamc@85 338 destruct (cfold e1).
adamc@85 339
adamc@85 340 [[
adamc@85 341 User error: e1 is used in hypothesis e
adamc@85 342 ]]
adamc@85 343
adamc@85 344 Coq gives us another cryptic error message. Like so many others, this one basically means that Coq is not able to build some proof about dependent types. It is hard to generate helpful and specific error messages for problems like this, since that would require some kind of understanding of the dependency structure of a piece of code. We will encounter many examples of case-specific tricks for recovering from errors like this one.
adamc@85 345
adamc@85 346 For our current proof, we can use a tactic [dep_destruct] defined in the book [Tactics] module. General elimination/inversion of dependently-typed hypotheses is undecidable, since it must be implemented with [match] expressions that have the restriction on [in] clauses that we have already discussed. [dep_destruct] makes a best effort to handle some common cases. In a future chapter, we will learn about the explicit manipulation of equality proofs that is behind [dep_destruct]'s implementation in Ltac, but for now, we treat it as a useful black box. *)
adamc@85 347
adamc@85 348 dep_destruct (cfold e1).
adamc@85 349
adamc@85 350 (** This successfully breaks the subgoal into 5 new subgoals, one for each constructor of [exp] that could produce an [exp Nat]. Note that [dep_destruct] is successful in ruling out the other cases automatically, in effect automating some of the work that we have done manually in implementing functions like [hd] and [pairOut].
adamc@85 351
adamc@85 352 This is the only new trick we need to learn to complete the proof. We can back up and give a short, automated proof. *)
adamc@85 353
adamc@85 354 Restart.
adamc@85 355
adamc@85 356 induction e; crush;
adamc@85 357 repeat (match goal with
adamc@85 358 | [ |- context[cfold ?E] ] => dep_destruct (cfold E)
adamc@85 359 | [ |- (if ?E then _ else _) = _ ] => destruct E
adamc@85 360 end; crush).
adamc@85 361 Qed.
adamc@100 362 (* end thide *)
adamc@86 363
adamc@86 364
adamc@94 365 (** Dependently-Typed Red-Black Trees *)
adamc@94 366
adamc@100 367 (** TODO: Add commentary for this section. *)
adamc@100 368
adamc@94 369 Inductive color : Set := Red | Black.
adamc@94 370
adamc@94 371 Inductive rbtree : color -> nat -> Set :=
adamc@94 372 | Leaf : rbtree Black 0
adamc@94 373 | RedNode : forall n, rbtree Black n -> nat-> rbtree Black n -> rbtree Red n
adamc@94 374 | BlackNode : forall c1 c2 n, rbtree c1 n -> nat -> rbtree c2 n -> rbtree Black (S n).
adamc@94 375
adamc@100 376 (* EX: Prove that every [rbtree] is balanced. *)
adamc@100 377
adamc@100 378 (* begin thide *)
adamc@95 379 Require Import Max Min.
adamc@95 380
adamc@95 381 Section depth.
adamc@95 382 Variable f : nat -> nat -> nat.
adamc@95 383
adamc@95 384 Fixpoint depth c n (t : rbtree c n) {struct t} : nat :=
adamc@95 385 match t with
adamc@95 386 | Leaf => 0
adamc@95 387 | RedNode _ t1 _ t2 => S (f (depth t1) (depth t2))
adamc@95 388 | BlackNode _ _ _ t1 _ t2 => S (f (depth t1) (depth t2))
adamc@95 389 end.
adamc@95 390 End depth.
adamc@95 391
adamc@95 392 Theorem depth_min : forall c n (t : rbtree c n), depth min t >= n.
adamc@95 393 induction t; crush;
adamc@95 394 match goal with
adamc@95 395 | [ |- context[min ?X ?Y] ] => destruct (min_dec X Y)
adamc@95 396 end; crush.
adamc@95 397 Qed.
adamc@95 398
adamc@95 399 Lemma depth_max' : forall c n (t : rbtree c n), match c with
adamc@95 400 | Red => depth max t <= 2 * n + 1
adamc@95 401 | Black => depth max t <= 2 * n
adamc@95 402 end.
adamc@95 403 induction t; crush;
adamc@95 404 match goal with
adamc@95 405 | [ |- context[max ?X ?Y] ] => destruct (max_dec X Y)
adamc@100 406 end; crush;
adamc@100 407 repeat (match goal with
adamc@100 408 | [ H : context[match ?C with Red => _ | Black => _ end] |- _ ] => destruct C
adamc@100 409 end; crush).
adamc@95 410 Qed.
adamc@95 411
adamc@95 412 Theorem depth_max : forall c n (t : rbtree c n), depth max t <= 2 * n + 1.
adamc@95 413 intros; generalize (depth_max' t); destruct c; crush.
adamc@95 414 Qed.
adamc@95 415
adamc@95 416 Theorem balanced : forall c n (t : rbtree c n), 2 * depth min t + 1 >= depth max t.
adamc@95 417 intros; generalize (depth_min t); generalize (depth_max t); crush.
adamc@95 418 Qed.
adamc@100 419 (* end thide *)
adamc@95 420
adamc@95 421
adamc@94 422 Inductive rtree : nat -> Set :=
adamc@94 423 | RedNode' : forall c1 c2 n, rbtree c1 n -> nat -> rbtree c2 n -> rtree n.
adamc@94 424
adamc@96 425 Section present.
adamc@96 426 Variable x : nat.
adamc@96 427
adamc@96 428 Fixpoint present c n (t : rbtree c n) {struct t} : Prop :=
adamc@96 429 match t with
adamc@96 430 | Leaf => False
adamc@96 431 | RedNode _ a y b => present a \/ x = y \/ present b
adamc@96 432 | BlackNode _ _ _ a y b => present a \/ x = y \/ present b
adamc@96 433 end.
adamc@96 434
adamc@96 435 Definition rpresent n (t : rtree n) : Prop :=
adamc@96 436 match t with
adamc@96 437 | RedNode' _ _ _ a y b => present a \/ x = y \/ present b
adamc@96 438 end.
adamc@96 439 End present.
adamc@96 440
adamc@100 441 Locate "{ _ : _ & _ }".
adamc@100 442 Print sigT.
adamc@100 443
adamc@94 444 Notation "{< x >}" := (existT _ _ x).
adamc@94 445
adamc@94 446 Definition balance1 n (a : rtree n) (data : nat) c2 :=
adamc@94 447 match a in rtree n return rbtree c2 n -> { c : color & rbtree c (S n) } with
adamc@94 448 | RedNode' _ _ _ t1 y t2 =>
adamc@94 449 match t1 in rbtree c n return rbtree _ n -> rbtree c2 n -> { c : color & rbtree c (S n) } with
adamc@94 450 | RedNode _ a x b => fun c d => {<RedNode (BlackNode a x b) y (BlackNode c data d)>}
adamc@94 451 | t1' => fun t2 =>
adamc@94 452 match t2 in rbtree c n return rbtree _ n -> rbtree c2 n -> { c : color & rbtree c (S n) } with
adamc@94 453 | RedNode _ b x c => fun a d => {<RedNode (BlackNode a y b) x (BlackNode c data d)>}
adamc@95 454 | b => fun a t => {<BlackNode (RedNode a y b) data t>}
adamc@94 455 end t1'
adamc@94 456 end t2
adamc@94 457 end.
adamc@94 458
adamc@94 459 Definition balance2 n (a : rtree n) (data : nat) c2 :=
adamc@94 460 match a in rtree n return rbtree c2 n -> { c : color & rbtree c (S n) } with
adamc@94 461 | RedNode' _ _ _ t1 z t2 =>
adamc@94 462 match t1 in rbtree c n return rbtree _ n -> rbtree c2 n -> { c : color & rbtree c (S n) } with
adamc@94 463 | RedNode _ b y c => fun d a => {<RedNode (BlackNode a data b) y (BlackNode c z d)>}
adamc@94 464 | t1' => fun t2 =>
adamc@94 465 match t2 in rbtree c n return rbtree _ n -> rbtree c2 n -> { c : color & rbtree c (S n) } with
adamc@94 466 | RedNode _ c z' d => fun b a => {<RedNode (BlackNode a data b) z (BlackNode c z' d)>}
adamc@95 467 | b => fun a t => {<BlackNode t data (RedNode a z b)>}
adamc@94 468 end t1'
adamc@94 469 end t2
adamc@94 470 end.
adamc@94 471
adamc@94 472 Section insert.
adamc@94 473 Variable x : nat.
adamc@94 474
adamc@94 475 Definition insResult c n :=
adamc@94 476 match c with
adamc@94 477 | Red => rtree n
adamc@94 478 | Black => { c' : color & rbtree c' n }
adamc@94 479 end.
adamc@94 480
adamc@94 481 Fixpoint ins c n (t : rbtree c n) {struct t} : insResult c n :=
adamc@94 482 match t in (rbtree c n) return (insResult c n) with
adamc@94 483 | Leaf => {< RedNode Leaf x Leaf >}
adamc@94 484 | RedNode _ a y b =>
adamc@94 485 if le_lt_dec x y
adamc@94 486 then RedNode' (projT2 (ins a)) y b
adamc@94 487 else RedNode' a y (projT2 (ins b))
adamc@94 488 | BlackNode c1 c2 _ a y b =>
adamc@94 489 if le_lt_dec x y
adamc@94 490 then
adamc@94 491 match c1 return insResult c1 _ -> _ with
adamc@94 492 | Red => fun ins_a => balance1 ins_a y b
adamc@94 493 | _ => fun ins_a => {< BlackNode (projT2 ins_a) y b >}
adamc@94 494 end (ins a)
adamc@94 495 else
adamc@94 496 match c2 return insResult c2 _ -> _ with
adamc@94 497 | Red => fun ins_b => balance2 ins_b y a
adamc@94 498 | _ => fun ins_b => {< BlackNode a y (projT2 ins_b) >}
adamc@94 499 end (ins b)
adamc@94 500 end.
adamc@94 501
adamc@94 502 Definition insertResult c n :=
adamc@94 503 match c with
adamc@94 504 | Red => rbtree Black (S n)
adamc@94 505 | Black => { c' : color & rbtree c' n }
adamc@94 506 end.
adamc@94 507
adamc@97 508 Definition makeRbtree c n : insResult c n -> insertResult c n :=
adamc@94 509 match c return insResult c n -> insertResult c n with
adamc@94 510 | Red => fun r =>
adamc@94 511 match r in rtree n return insertResult Red n with
adamc@94 512 | RedNode' _ _ _ a x b => BlackNode a x b
adamc@94 513 end
adamc@94 514 | Black => fun r => r
adamc@94 515 end.
adamc@94 516
adamc@97 517 Implicit Arguments makeRbtree [c n].
adamc@94 518
adamc@94 519 Definition insert c n (t : rbtree c n) : insertResult c n :=
adamc@97 520 makeRbtree (ins t).
adamc@94 521
adamc@95 522 Section present.
adamc@95 523 Variable z : nat.
adamc@95 524
adamc@98 525 Ltac present_balance :=
adamc@98 526 crush;
adamc@98 527 repeat (match goal with
adamc@98 528 | [ H : context[match ?T with
adamc@98 529 | Leaf => _
adamc@98 530 | RedNode _ _ _ _ => _
adamc@98 531 | BlackNode _ _ _ _ _ _ => _
adamc@98 532 end] |- _ ] => dep_destruct T
adamc@98 533 | [ |- context[match ?T with
adamc@98 534 | Leaf => _
adamc@98 535 | RedNode _ _ _ _ => _
adamc@98 536 | BlackNode _ _ _ _ _ _ => _
adamc@98 537 end] ] => dep_destruct T
adamc@98 538 end; crush).
adamc@98 539
adamc@95 540 Lemma present_balance1 : forall n (a : rtree n) (y : nat) c2 (b : rbtree c2 n) ,
adamc@95 541 present z (projT2 (balance1 a y b))
adamc@95 542 <-> rpresent z a \/ z = y \/ present z b.
adamc@98 543 destruct a; present_balance.
adamc@95 544 Qed.
adamc@95 545
adamc@95 546 Lemma present_balance2 : forall n (a : rtree n) (y : nat) c2 (b : rbtree c2 n) ,
adamc@95 547 present z (projT2 (balance2 a y b))
adamc@95 548 <-> rpresent z a \/ z = y \/ present z b.
adamc@98 549 destruct a; present_balance.
adamc@95 550 Qed.
adamc@95 551
adamc@95 552 Definition present_insResult c n :=
adamc@95 553 match c return (rbtree c n -> insResult c n -> Prop) with
adamc@95 554 | Red => fun t r => rpresent z r <-> z = x \/ present z t
adamc@95 555 | Black => fun t r => present z (projT2 r) <-> z = x \/ present z t
adamc@95 556 end.
adamc@95 557
adamc@95 558 Theorem present_ins : forall c n (t : rbtree c n),
adamc@95 559 present_insResult t (ins t).
adamc@95 560 induction t; crush;
adamc@95 561 repeat (match goal with
adamc@95 562 | [ H : context[if ?E then _ else _] |- _ ] => destruct E
adamc@95 563 | [ |- context[if ?E then _ else _] ] => destruct E
adamc@95 564 | [ H : context[match ?C with Red => _ | Black => _ end] |- _ ] => destruct C
adamc@95 565 end; crush);
adamc@95 566 try match goal with
adamc@95 567 | [ H : context[balance1 ?A ?B ?C] |- _ ] =>
adamc@95 568 generalize (present_balance1 A B C)
adamc@95 569 end;
adamc@95 570 try match goal with
adamc@95 571 | [ H : context[balance2 ?A ?B ?C] |- _ ] =>
adamc@95 572 generalize (present_balance2 A B C)
adamc@95 573 end;
adamc@95 574 try match goal with
adamc@95 575 | [ |- context[balance1 ?A ?B ?C] ] =>
adamc@95 576 generalize (present_balance1 A B C)
adamc@95 577 end;
adamc@95 578 try match goal with
adamc@95 579 | [ |- context[balance2 ?A ?B ?C] ] =>
adamc@95 580 generalize (present_balance2 A B C)
adamc@95 581 end;
adamc@95 582 intuition;
adamc@95 583 match goal with
adamc@95 584 | [ z : nat, x : nat |- _ ] =>
adamc@95 585 match goal with
adamc@95 586 | [ H : z = x |- _ ] => rewrite H in *; clear H
adamc@95 587 end
adamc@95 588 end;
adamc@95 589 tauto.
adamc@95 590 Qed.
adamc@95 591
adamc@97 592 Ltac present_insert t :=
adamc@97 593 unfold insert; inversion t;
adamc@97 594 generalize (present_ins t); simpl;
adamc@97 595 dep_destruct (ins t); tauto.
adamc@97 596
adamc@95 597 Theorem present_insert_Red : forall n (t : rbtree Red n),
adamc@95 598 present z (insert t)
adamc@95 599 <-> (z = x \/ present z t).
adamc@97 600 intros; present_insert t.
adamc@95 601 Qed.
adamc@95 602
adamc@95 603 Theorem present_insert_Black : forall n (t : rbtree Black n),
adamc@95 604 present z (projT2 (insert t))
adamc@95 605 <-> (z = x \/ present z t).
adamc@97 606 intros; present_insert t.
adamc@95 607 Qed.
adamc@95 608 End present.
adamc@94 609 End insert.
adamc@94 610
adamc@94 611
adamc@86 612 (** * A Certified Regular Expression Matcher *)
adamc@86 613
adamc@93 614 (** Another interesting example is regular expressions with dependent types that express which predicates over strings particular regexps implement. We can then assign a dependent type to a regular expression matching function, guaranteeing that it always decides the string property that we expect it to decide.
adamc@93 615
adamc@93 616 Before defining the syntax of expressions, it is helpful to define an inductive type capturing the meaning of the Kleene star. We use Coq's string support, which comes through a combination of the [Strings] library and some parsing notations built into Coq. Operators like [++] and functions like [length] that we know from lists are defined again for strings. Notation scopes help us control which versions we want to use in particular contexts. *)
adamc@93 617
adamc@86 618 Require Import Ascii String.
adamc@86 619 Open Scope string_scope.
adamc@86 620
adamc@91 621 Section star.
adamc@91 622 Variable P : string -> Prop.
adamc@91 623
adamc@91 624 Inductive star : string -> Prop :=
adamc@91 625 | Empty : star ""
adamc@91 626 | Iter : forall s1 s2,
adamc@91 627 P s1
adamc@91 628 -> star s2
adamc@91 629 -> star (s1 ++ s2).
adamc@91 630 End star.
adamc@91 631
adamc@93 632 (** Now we can make our first attempt at defining a [regexp] type that is indexed by predicates on strings. Here is a reasonable-looking definition that is restricted to constant characters and concatenation.
adamc@93 633
adamc@93 634 [[
adamc@93 635 Inductive regexp : (string -> Prop) -> Set :=
adamc@93 636 | Char : forall ch : ascii,
adamc@93 637 regexp (fun s => s = String ch "")
adamc@93 638 | Concat : forall (P1 P2 : string -> Prop) (r1 : regexp P1) (r2 : regexp P2),
adamc@93 639 regexp (fun s => exists s1, exists s2, s = s1 ++ s2 /\ P1 s1 /\ P2 s2).
adamc@93 640
adamc@93 641 [[
adamc@93 642 User error: Large non-propositional inductive types must be in Type
adamc@93 643 ]]
adamc@93 644
adamc@93 645 What is a large inductive type? In Coq, it is an inductive type that has a constructor which quantifies over some type of type [Type]. We have not worked with [Type] very much to this point. Every term of CIC has a type, including [Set] and [Prop], which are assigned type [Type]. The type [string -> Prop] from the failed definition also has type [Type].
adamc@93 646
adamc@93 647 It turns out that allowing large inductive types in [Set] leads to contradictions when combined with certain kinds of classical logic reasoning. Thus, by default, such types are ruled out. There is a simple fix for our [regexp] definition, which is to place our new type in [Type]. While fixing the problem, we also expand the list of constructors to cover the remaining regular expression operators. *)
adamc@93 648
adamc@89 649 Inductive regexp : (string -> Prop) -> Type :=
adamc@86 650 | Char : forall ch : ascii,
adamc@86 651 regexp (fun s => s = String ch "")
adamc@86 652 | Concat : forall P1 P2 (r1 : regexp P1) (r2 : regexp P2),
adamc@87 653 regexp (fun s => exists s1, exists s2, s = s1 ++ s2 /\ P1 s1 /\ P2 s2)
adamc@87 654 | Or : forall P1 P2 (r1 : regexp P1) (r2 : regexp P2),
adamc@91 655 regexp (fun s => P1 s \/ P2 s)
adamc@91 656 | Star : forall P (r : regexp P),
adamc@91 657 regexp (star P).
adamc@86 658
adamc@93 659 (** Many theorems about strings are useful for implementing a certified regexp matcher, and few of them are in the [Strings] library. The book source includes statements, proofs, and hint commands for a handful of such omittted theorems. Since they are orthogonal to our use of dependent types, we hide them in the rendered versions of this book. *)
adamc@93 660
adamc@93 661 (* begin hide *)
adamc@86 662 Open Scope specif_scope.
adamc@86 663
adamc@86 664 Lemma length_emp : length "" <= 0.
adamc@86 665 crush.
adamc@86 666 Qed.
adamc@86 667
adamc@86 668 Lemma append_emp : forall s, s = "" ++ s.
adamc@86 669 crush.
adamc@86 670 Qed.
adamc@86 671
adamc@86 672 Ltac substring :=
adamc@86 673 crush;
adamc@86 674 repeat match goal with
adamc@86 675 | [ |- context[match ?N with O => _ | S _ => _ end] ] => destruct N; crush
adamc@86 676 end.
adamc@86 677
adamc@86 678 Lemma substring_le : forall s n m,
adamc@86 679 length (substring n m s) <= m.
adamc@86 680 induction s; substring.
adamc@86 681 Qed.
adamc@86 682
adamc@86 683 Lemma substring_all : forall s,
adamc@86 684 substring 0 (length s) s = s.
adamc@86 685 induction s; substring.
adamc@86 686 Qed.
adamc@86 687
adamc@86 688 Lemma substring_none : forall s n,
adamc@93 689 substring n 0 s = "".
adamc@86 690 induction s; substring.
adamc@86 691 Qed.
adamc@86 692
adamc@86 693 Hint Rewrite substring_all substring_none : cpdt.
adamc@86 694
adamc@86 695 Lemma substring_split : forall s m,
adamc@86 696 substring 0 m s ++ substring m (length s - m) s = s.
adamc@86 697 induction s; substring.
adamc@86 698 Qed.
adamc@86 699
adamc@86 700 Lemma length_app1 : forall s1 s2,
adamc@86 701 length s1 <= length (s1 ++ s2).
adamc@86 702 induction s1; crush.
adamc@86 703 Qed.
adamc@86 704
adamc@86 705 Hint Resolve length_emp append_emp substring_le substring_split length_app1.
adamc@86 706
adamc@86 707 Lemma substring_app_fst : forall s2 s1 n,
adamc@86 708 length s1 = n
adamc@86 709 -> substring 0 n (s1 ++ s2) = s1.
adamc@86 710 induction s1; crush.
adamc@86 711 Qed.
adamc@86 712
adamc@86 713 Lemma substring_app_snd : forall s2 s1 n,
adamc@86 714 length s1 = n
adamc@86 715 -> substring n (length (s1 ++ s2) - n) (s1 ++ s2) = s2.
adamc@86 716 Hint Rewrite <- minus_n_O : cpdt.
adamc@86 717
adamc@86 718 induction s1; crush.
adamc@86 719 Qed.
adamc@86 720
adamc@91 721 Hint Rewrite substring_app_fst substring_app_snd using (trivial; fail) : cpdt.
adamc@93 722 (* end hide *)
adamc@93 723
adamc@93 724 (** A few auxiliary functions help us in our final matcher definition. The function [split] will be used to implement the regexp concatenation case. *)
adamc@86 725
adamc@86 726 Section split.
adamc@86 727 Variables P1 P2 : string -> Prop.
adamc@91 728 Variable P1_dec : forall s, {P1 s} + { ~P1 s}.
adamc@91 729 Variable P2_dec : forall s, {P2 s} + { ~P2 s}.
adamc@93 730 (** We require a choice of two arbitrary string predicates and functions for deciding them. *)
adamc@86 731
adamc@86 732 Variable s : string.
adamc@93 733 (** Our computation will take place relative to a single fixed string, so it is easiest to make it a [Variable], rather than an explicit argument to our functions. *)
adamc@93 734
adamc@93 735 (** [split'] is the workhorse behind [split]. It searches through the possible ways of splitting [s] into two pieces, checking the two predicates against each such pair. [split'] progresses right-to-left, from splitting all of [s] into the first piece to splitting all of [s] into the second piece. It takes an extra argument, [n], which specifies how far along we are in this search process. *)
adamc@86 736
adamc@86 737 Definition split' (n : nat) : n <= length s
adamc@86 738 -> {exists s1, exists s2, length s1 <= n /\ s1 ++ s2 = s /\ P1 s1 /\ P2 s2}
adamc@86 739 + {forall s1 s2, length s1 <= n -> s1 ++ s2 = s -> ~P1 s1 \/ ~P2 s2}.
adamc@86 740 refine (fix F (n : nat) : n <= length s
adamc@86 741 -> {exists s1, exists s2, length s1 <= n /\ s1 ++ s2 = s /\ P1 s1 /\ P2 s2}
adamc@86 742 + {forall s1 s2, length s1 <= n -> s1 ++ s2 = s -> ~P1 s1 \/ ~P2 s2} :=
adamc@86 743 match n return n <= length s
adamc@86 744 -> {exists s1, exists s2, length s1 <= n /\ s1 ++ s2 = s /\ P1 s1 /\ P2 s2}
adamc@86 745 + {forall s1 s2, length s1 <= n -> s1 ++ s2 = s -> ~P1 s1 \/ ~P2 s2} with
adamc@86 746 | O => fun _ => Reduce (P1_dec "" && P2_dec s)
adamc@93 747 | S n' => fun _ => (P1_dec (substring 0 (S n') s)
adamc@93 748 && P2_dec (substring (S n') (length s - S n') s))
adamc@86 749 || F n' _
adamc@86 750 end); clear F; crush; eauto 7;
adamc@86 751 match goal with
adamc@86 752 | [ _ : length ?S <= 0 |- _ ] => destruct S
adamc@86 753 | [ _ : length ?S' <= S ?N |- _ ] =>
adamc@86 754 generalize (eq_nat_dec (length S') (S N)); destruct 1
adamc@86 755 end; crush.
adamc@86 756 Defined.
adamc@86 757
adamc@93 758 (** There is one subtle point in the [split'] code that is worth mentioning. The main body of the function is a [match] on [n]. In the case where [n] is known to be [S n'], we write [S n'] in several places where we might be tempted to write [n]. However, without further work to craft proper [match] annotations, the type-checker does not use the equality between [n] and [S n']. Thus, it is common to see patterns repeated in [match] case bodies in dependently-typed Coq code. We can at least use a [let] expression to avoid copying the pattern more than once, replacing the first case body with:
adamc@93 759
adamc@93 760 [[
adamc@93 761
adamc@93 762 | S n' => fun _ => let n := S n' in
adamc@93 763 (P1_dec (substring 0 n s)
adamc@93 764 && P2_dec (substring n (length s - n) s))
adamc@93 765 || F n' _
adamc@93 766 ]]
adamc@93 767
adamc@93 768 [split] itself is trivial to implement in terms of [split']. We just ask [split'] to begin its search with [n = length s]. *)
adamc@93 769
adamc@86 770 Definition split : {exists s1, exists s2, s = s1 ++ s2 /\ P1 s1 /\ P2 s2}
adamc@86 771 + {forall s1 s2, s = s1 ++ s2 -> ~P1 s1 \/ ~P2 s2}.
adamc@86 772 refine (Reduce (split' (n := length s) _)); crush; eauto.
adamc@86 773 Defined.
adamc@86 774 End split.
adamc@86 775
adamc@86 776 Implicit Arguments split [P1 P2].
adamc@86 777
adamc@93 778 (* begin hide *)
adamc@91 779 Lemma app_empty_end : forall s, s ++ "" = s.
adamc@91 780 induction s; crush.
adamc@91 781 Qed.
adamc@91 782
adamc@91 783 Hint Rewrite app_empty_end : cpdt.
adamc@91 784
adamc@91 785 Lemma substring_self : forall s n,
adamc@91 786 n <= 0
adamc@91 787 -> substring n (length s - n) s = s.
adamc@91 788 induction s; substring.
adamc@91 789 Qed.
adamc@91 790
adamc@91 791 Lemma substring_empty : forall s n m,
adamc@91 792 m <= 0
adamc@91 793 -> substring n m s = "".
adamc@91 794 induction s; substring.
adamc@91 795 Qed.
adamc@91 796
adamc@91 797 Hint Rewrite substring_self substring_empty using omega : cpdt.
adamc@91 798
adamc@91 799 Lemma substring_split' : forall s n m,
adamc@91 800 substring n m s ++ substring (n + m) (length s - (n + m)) s
adamc@91 801 = substring n (length s - n) s.
adamc@91 802 Hint Rewrite substring_split : cpdt.
adamc@91 803
adamc@91 804 induction s; substring.
adamc@91 805 Qed.
adamc@91 806
adamc@91 807 Lemma substring_stack : forall s n2 m1 m2,
adamc@91 808 m1 <= m2
adamc@91 809 -> substring 0 m1 (substring n2 m2 s)
adamc@91 810 = substring n2 m1 s.
adamc@91 811 induction s; substring.
adamc@91 812 Qed.
adamc@91 813
adamc@91 814 Ltac substring' :=
adamc@91 815 crush;
adamc@91 816 repeat match goal with
adamc@91 817 | [ |- context[match ?N with O => _ | S _ => _ end] ] => case_eq N; crush
adamc@91 818 end.
adamc@91 819
adamc@91 820 Lemma substring_stack' : forall s n1 n2 m1 m2,
adamc@91 821 n1 + m1 <= m2
adamc@91 822 -> substring n1 m1 (substring n2 m2 s)
adamc@91 823 = substring (n1 + n2) m1 s.
adamc@91 824 induction s; substring';
adamc@91 825 match goal with
adamc@91 826 | [ |- substring ?N1 _ _ = substring ?N2 _ _ ] =>
adamc@91 827 replace N1 with N2; crush
adamc@91 828 end.
adamc@91 829 Qed.
adamc@91 830
adamc@91 831 Lemma substring_suffix : forall s n,
adamc@91 832 n <= length s
adamc@91 833 -> length (substring n (length s - n) s) = length s - n.
adamc@91 834 induction s; substring.
adamc@91 835 Qed.
adamc@91 836
adamc@91 837 Lemma substring_suffix_emp' : forall s n m,
adamc@91 838 substring n (S m) s = ""
adamc@91 839 -> n >= length s.
adamc@91 840 induction s; crush;
adamc@91 841 match goal with
adamc@91 842 | [ |- ?N >= _ ] => destruct N; crush
adamc@91 843 end;
adamc@91 844 match goal with
adamc@91 845 [ |- S ?N >= S ?E ] => assert (N >= E); [ eauto | omega ]
adamc@91 846 end.
adamc@91 847 Qed.
adamc@91 848
adamc@91 849 Lemma substring_suffix_emp : forall s n m,
adamc@92 850 substring n m s = ""
adamc@92 851 -> m > 0
adamc@91 852 -> n >= length s.
adamc@91 853 destruct m as [| m]; [crush | intros; apply substring_suffix_emp' with m; assumption].
adamc@91 854 Qed.
adamc@91 855
adamc@91 856 Hint Rewrite substring_stack substring_stack' substring_suffix
adamc@91 857 using omega : cpdt.
adamc@91 858
adamc@91 859 Lemma minus_minus : forall n m1 m2,
adamc@91 860 m1 + m2 <= n
adamc@91 861 -> n - m1 - m2 = n - (m1 + m2).
adamc@91 862 intros; omega.
adamc@91 863 Qed.
adamc@91 864
adamc@91 865 Lemma plus_n_Sm' : forall n m : nat, S (n + m) = m + S n.
adamc@91 866 intros; omega.
adamc@91 867 Qed.
adamc@91 868
adamc@91 869 Hint Rewrite minus_minus using omega : cpdt.
adamc@93 870 (* end hide *)
adamc@93 871
adamc@93 872 (** One more helper function will come in handy: [dec_star], for implementing another linear search through ways of splitting a string, this time for implementing the Kleene star. *)
adamc@91 873
adamc@91 874 Section dec_star.
adamc@91 875 Variable P : string -> Prop.
adamc@91 876 Variable P_dec : forall s, {P s} + { ~P s}.
adamc@91 877
adamc@93 878 (** Some new lemmas and hints about the [star] type family are useful here. We omit them here; they are included in the book source at this point. *)
adamc@93 879
adamc@93 880 (* begin hide *)
adamc@91 881 Hint Constructors star.
adamc@91 882
adamc@91 883 Lemma star_empty : forall s,
adamc@91 884 length s = 0
adamc@91 885 -> star P s.
adamc@91 886 destruct s; crush.
adamc@91 887 Qed.
adamc@91 888
adamc@91 889 Lemma star_singleton : forall s, P s -> star P s.
adamc@91 890 intros; rewrite <- (app_empty_end s); auto.
adamc@91 891 Qed.
adamc@91 892
adamc@91 893 Lemma star_app : forall s n m,
adamc@91 894 P (substring n m s)
adamc@91 895 -> star P (substring (n + m) (length s - (n + m)) s)
adamc@91 896 -> star P (substring n (length s - n) s).
adamc@91 897 induction n; substring;
adamc@91 898 match goal with
adamc@91 899 | [ H : P (substring ?N ?M ?S) |- _ ] =>
adamc@91 900 solve [ rewrite <- (substring_split S M); auto
adamc@91 901 | rewrite <- (substring_split' S N M); auto ]
adamc@91 902 end.
adamc@91 903 Qed.
adamc@91 904
adamc@91 905 Hint Resolve star_empty star_singleton star_app.
adamc@91 906
adamc@91 907 Variable s : string.
adamc@91 908
adamc@91 909 Lemma star_inv : forall s,
adamc@91 910 star P s
adamc@91 911 -> s = ""
adamc@91 912 \/ exists i, i < length s
adamc@91 913 /\ P (substring 0 (S i) s)
adamc@91 914 /\ star P (substring (S i) (length s - S i) s).
adamc@91 915 Hint Extern 1 (exists i : nat, _) =>
adamc@91 916 match goal with
adamc@91 917 | [ H : P (String _ ?S) |- _ ] => exists (length S); crush
adamc@91 918 end.
adamc@91 919
adamc@91 920 induction 1; [
adamc@91 921 crush
adamc@91 922 | match goal with
adamc@91 923 | [ _ : P ?S |- _ ] => destruct S; crush
adamc@91 924 end
adamc@91 925 ].
adamc@91 926 Qed.
adamc@91 927
adamc@91 928 Lemma star_substring_inv : forall n,
adamc@91 929 n <= length s
adamc@91 930 -> star P (substring n (length s - n) s)
adamc@91 931 -> substring n (length s - n) s = ""
adamc@91 932 \/ exists l, l < length s - n
adamc@91 933 /\ P (substring n (S l) s)
adamc@91 934 /\ star P (substring (n + S l) (length s - (n + S l)) s).
adamc@91 935 Hint Rewrite plus_n_Sm' : cpdt.
adamc@91 936
adamc@91 937 intros;
adamc@91 938 match goal with
adamc@91 939 | [ H : star _ _ |- _ ] => generalize (star_inv H); do 3 crush; eauto
adamc@91 940 end.
adamc@91 941 Qed.
adamc@93 942 (* end hide *)
adamc@93 943
adamc@93 944 (** The function [dec_star''] implements a single iteration of the star. That is, it tries to find a string prefix matching [P], and it calls a parameter function on the remainder of the string. *)
adamc@91 945
adamc@91 946 Section dec_star''.
adamc@91 947 Variable n : nat.
adamc@93 948 (** [n] is the length of the prefix of [s] that we have already processed. *)
adamc@91 949
adamc@91 950 Variable P' : string -> Prop.
adamc@91 951 Variable P'_dec : forall n' : nat, n' > n
adamc@91 952 -> {P' (substring n' (length s - n') s)}
adamc@91 953 + { ~P' (substring n' (length s - n') s)}.
adamc@93 954 (** When we use [dec_star''], we will instantiate [P'_dec] with a function for continuing the search for more instances of [P] in [s]. *)
adamc@93 955
adamc@93 956 (** Now we come to [dec_star''] itself. It takes as an input a natural [l] that records how much of the string has been searched so far, as we did for [split']. The return type expresses that [dec_star''] is looking for an index into [s] that splits [s] into a nonempty prefix and a suffix, such that the prefix satisfies [P] and the suffix satisfies [P']. *)
adamc@91 957
adamc@91 958 Definition dec_star'' (l : nat)
adamc@91 959 : {exists l', S l' <= l
adamc@91 960 /\ P (substring n (S l') s) /\ P' (substring (n + S l') (length s - (n + S l')) s)}
adamc@91 961 + {forall l', S l' <= l
adamc@93 962 -> ~P (substring n (S l') s)
adamc@93 963 \/ ~P' (substring (n + S l') (length s - (n + S l')) s)}.
adamc@91 964 refine (fix F (l : nat) : {exists l', S l' <= l
adamc@91 965 /\ P (substring n (S l') s) /\ P' (substring (n + S l') (length s - (n + S l')) s)}
adamc@91 966 + {forall l', S l' <= l
adamc@93 967 -> ~P (substring n (S l') s)
adamc@93 968 \/ ~P' (substring (n + S l') (length s - (n + S l')) s)} :=
adamc@91 969 match l return {exists l', S l' <= l
adamc@91 970 /\ P (substring n (S l') s) /\ P' (substring (n + S l') (length s - (n + S l')) s)}
adamc@91 971 + {forall l', S l' <= l ->
adamc@93 972 ~P (substring n (S l') s)
adamc@93 973 \/ ~P' (substring (n + S l') (length s - (n + S l')) s)} with
adamc@91 974 | O => _
adamc@91 975 | S l' =>
adamc@91 976 (P_dec (substring n (S l') s) && P'_dec (n' := n + S l') _)
adamc@91 977 || F l'
adamc@91 978 end); clear F; crush; eauto 7;
adamc@91 979 match goal with
adamc@91 980 | [ H : ?X <= S ?Y |- _ ] => destruct (eq_nat_dec X (S Y)); crush
adamc@91 981 end.
adamc@91 982 Defined.
adamc@91 983 End dec_star''.
adamc@91 984
adamc@93 985 (* begin hide *)
adamc@92 986 Lemma star_length_contra : forall n,
adamc@92 987 length s > n
adamc@92 988 -> n >= length s
adamc@92 989 -> False.
adamc@92 990 crush.
adamc@92 991 Qed.
adamc@92 992
adamc@92 993 Lemma star_length_flip : forall n n',
adamc@92 994 length s - n <= S n'
adamc@92 995 -> length s > n
adamc@92 996 -> length s - n > 0.
adamc@92 997 crush.
adamc@92 998 Qed.
adamc@92 999
adamc@92 1000 Hint Resolve star_length_contra star_length_flip substring_suffix_emp.
adamc@93 1001 (* end hide *)
adamc@92 1002
adamc@93 1003 (** The work of [dec_star''] is nested inside another linear search by [dec_star'], which provides the final functionality we need, but for arbitrary suffixes of [s], rather than just for [s] overall. *)
adamc@93 1004
adamc@91 1005 Definition dec_star' (n n' : nat) : length s - n' <= n
adamc@91 1006 -> {star P (substring n' (length s - n') s)}
adamc@93 1007 + { ~star P (substring n' (length s - n') s)}.
adamc@91 1008 refine (fix F (n n' : nat) {struct n} : length s - n' <= n
adamc@91 1009 -> {star P (substring n' (length s - n') s)}
adamc@93 1010 + { ~star P (substring n' (length s - n') s)} :=
adamc@91 1011 match n return length s - n' <= n
adamc@91 1012 -> {star P (substring n' (length s - n') s)}
adamc@93 1013 + { ~star P (substring n' (length s - n') s)} with
adamc@91 1014 | O => fun _ => Yes
adamc@91 1015 | S n'' => fun _ =>
adamc@91 1016 le_gt_dec (length s) n'
adamc@91 1017 || dec_star'' (n := n') (star P) (fun n0 _ => Reduce (F n'' n0 _)) (length s - n')
adamc@92 1018 end); clear F; crush; eauto;
adamc@92 1019 match goal with
adamc@92 1020 | [ H : star _ _ |- _ ] => apply star_substring_inv in H; crush; eauto
adamc@92 1021 end;
adamc@92 1022 match goal with
adamc@92 1023 | [ H1 : _ < _ - _, H2 : forall l' : nat, _ <= _ - _ -> _ |- _ ] =>
adamc@92 1024 generalize (H2 _ (lt_le_S _ _ H1)); tauto
adamc@92 1025 end.
adamc@91 1026 Defined.
adamc@91 1027
adamc@93 1028 (** Finally, we have [dec_star]. It has a straightforward implementation. We introduce a spurious match on [s] so that [simpl] will know to reduce calls to [dec_star]. The heuristic that [simpl] uses is only to unfold identifier definitions when doing so would simplify some [match] expression. *)
adamc@93 1029
adamc@91 1030 Definition dec_star : {star P s} + { ~star P s}.
adamc@91 1031 refine (match s with
adamc@91 1032 | "" => Reduce (dec_star' (n := length s) 0 _)
adamc@91 1033 | _ => Reduce (dec_star' (n := length s) 0 _)
adamc@91 1034 end); crush.
adamc@91 1035 Defined.
adamc@91 1036 End dec_star.
adamc@91 1037
adamc@93 1038 (* begin hide *)
adamc@86 1039 Lemma app_cong : forall x1 y1 x2 y2,
adamc@86 1040 x1 = x2
adamc@86 1041 -> y1 = y2
adamc@86 1042 -> x1 ++ y1 = x2 ++ y2.
adamc@86 1043 congruence.
adamc@86 1044 Qed.
adamc@86 1045
adamc@86 1046 Hint Resolve app_cong.
adamc@93 1047 (* end hide *)
adamc@93 1048
adamc@93 1049 (** With these helper functions completed, the implementation of our [matches] function is refreshingly straightforward. We only need one small piece of specific tactic work beyond what [crush] does for us. *)
adamc@86 1050
adamc@86 1051 Definition matches P (r : regexp P) s : {P s} + { ~P s}.
adamc@86 1052 refine (fix F P (r : regexp P) s : {P s} + { ~P s} :=
adamc@86 1053 match r with
adamc@86 1054 | Char ch => string_dec s (String ch "")
adamc@86 1055 | Concat _ _ r1 r2 => Reduce (split (F _ r1) (F _ r2) s)
adamc@87 1056 | Or _ _ r1 r2 => F _ r1 s || F _ r2 s
adamc@91 1057 | Star _ r => dec_star _ _ _
adamc@86 1058 end); crush;
adamc@86 1059 match goal with
adamc@86 1060 | [ H : _ |- _ ] => generalize (H _ _ (refl_equal _))
adamc@93 1061 end; tauto.
adamc@86 1062 Defined.
adamc@86 1063
adamc@93 1064 (* begin hide *)
adamc@86 1065 Example hi := Concat (Char "h"%char) (Char "i"%char).
adamc@86 1066 Eval simpl in matches hi "hi".
adamc@86 1067 Eval simpl in matches hi "bye".
adamc@87 1068
adamc@87 1069 Example a_b := Or (Char "a"%char) (Char "b"%char).
adamc@87 1070 Eval simpl in matches a_b "".
adamc@87 1071 Eval simpl in matches a_b "a".
adamc@87 1072 Eval simpl in matches a_b "aa".
adamc@87 1073 Eval simpl in matches a_b "b".
adamc@91 1074
adamc@91 1075 Example a_star := Star (Char "a"%char).
adamc@91 1076 Eval simpl in matches a_star "".
adamc@91 1077 Eval simpl in matches a_star "a".
adamc@91 1078 Eval simpl in matches a_star "b".
adamc@91 1079 Eval simpl in matches a_star "aa".
adamc@93 1080 (* end hide *)