Adam Chlipala - Research Interests & Projects
As recently as the 1980s, it was normal for "real-world" programmers to laugh at the idea of structured programming languages with such arcane features as "while" loops, as opposed to the jump instructions of assembly languages. How could a program written in a structured language possibly perform well enough, compared to hand-written assembly language? Those crazy academics are at it again, with their nutty theories. Of course, today the abstraction of structured control flow is ubiquitous. Data abstraction was another that seemed esoteric at first but has become completely standard now. Serious programmers today will laugh at anyone who uniformly writes off these techniques that seemed like theoretical distractions in 1980.
The driving thesis of my research is that machine-checked mathematical proofs are the next big enabler of practical program structuring techniques. Most observers today think that formal methods are academic curiosities that will never reduce software development costs so much that it seems silly not to use them. Those observers are as wrong as the 1980 critics of structured programming.
The serious programs of 2050 will rely on formal specifications and proofs the way that Java programs today rely on static types and interfaces. We know that our only hope to build complex systems is to find ways to break them into smaller systems, with effective ways to think about parts in isolation, viewing their dependencies abstractly through interfaces, formal or not. Proof assistant systems like Coq provide the ultimate level of flexibility and rigor in formulating decomposition strategies. Decades from now, this kind of formal reasoning will be as essential as the operating-systems abstractions developed starting in the 1950s. I've often felt jealous of the pioneers who had the chance to contribute to the science of computer systems in that time period, but I feel like those of us getting started today have a second shot at it, designing system structuring principles based on mechanized proof.
My research group is gradually building a proof-of-concept development ecosystem. Our results so far span the abstraction spectrum from mathematical specifications to functional programs to hardware circuits. Some pieces we have already connected together. Others are separate at the moment, but the end goal is to connect everything together into the computing platform of the future, replacing the ideas embodied today in, say, a Linux system running on commodity hardware, including the source code used to construct all of its pieces. This future system would be:
- Foundational: The actual running software and hardware are an implementation of some high-level specification. Alternatively, if you like, they are a realization of some reference implementation chosen to be as concise and understandable as possible. The implementation should be proved from first principles to follow the spec, assuming little more than basic axioms of logical reasoning.
- Flexible: Those guarantees should apply despite supporting a flexible open-source ecosystem, where programmers may choose whatever languages and tools they like, in the end compiling both source code and proofs about it to assembly language or some similarly low-level format. The common format may even be hardware designs, where future applications come packaged with circuit descriptions and proofs that they are reasonable to run.
- Productive: The human cost of development should be as low as possible. New techniques for abstraction and modularity are the bread and butter for this dimension, and machine-checked proofs are our secret weapon for designing them.
- High-performance: Productivity shouldn't require giving up on acceptable execution speed, memory usage, power usage, etc. Advanced compilation and synthesis techniques should remove runtime costs of high-level program structuring tools.
- Secure: Ideally, new programming methods will make common security problems impossible by construction, and it should at least be possible to prove that programs follow sensible security policies.
With Published Papers
- Ur/Web is a practical functional programming language for coding modern Web applications. It's not yet connected to the formal Coq stuff from the other projects described here, but the language design is inspired by dependent type theory and Coq, and it is already used in commercial applications. Papers describe the type system behind the metaprogramming approach of the core Ur language, the modularity and concurrency features of the Ur/Web domain-specific language, the whole-program optimizing compiler, and a static security analysis for Ur/Web programs.
- Bedrock is the core of our work on an ecosystem for verified software development. On one level, it's a Coq library defining the syntax and semantics of a common assembly language, plus a notion of code libraries packaged with specifications and proofs. More importantly, though, this common foundation enables the development of many different proved-correct and proof-generating tools, all producing code modules that can be linked together to produce whole-program theorems from first principles. We've worked on proof automation for correctness of low-level pointer-manipulating code, both via proof-generating procedures and proved-correct procedures; on integrating verification within an extensible C-like programming language; on a verified compiler to Bedrock from a C-like language with built-in data abstraction; and on a case study applying Bedrock to verify a directory server running on semi-autonomous robots.
- Fiat is our new framework for deductive program synthesis in Coq. That is, we derive programs from their specifications, in a way that guarantees correctness, without needing to trust much beyond the usual Coq proof checker. We repurpose Coq's proof-automation support to do highly automatic derivation of programs in specific domains. Our published examples so far deal with abstract data types that support particular SQL-style queries. We have some experimental code that connects Fiat to Bedrock, producing correct-by-construction assembly code from simple specs almost automatically; and we are working on additional specification domains where we can automate derivations effectively, such as grammars and parsing.
- We're also looking at how to take this style of reasoning to distributed programs, as with data structures replicated across multiple servers. Some preliminary work is in causal consistency of distributed key-value stores, and in this area I'm most interested in future work on verifying ACID distributed transactions.
- In collaboration with Arvind, we're looking at modular Coq verification of Bluespec-style hardware designs. For instance, one might want to verify a memory system and a processor independently, and then compose the proofs opaquely to derive that a system with N processors and an M-deep memory hierarchy implements a particular ISA. Our current work builds on past experiments with verified compilation for mini-Bluespec. We've completed a a modular proof of a particular infinite family of multicore hardware designs, and ongoing work is about defining a Coq framework to make that sort of proof more routine, also connecting back to verified compilation of hardware designs.
- In collaboration with Frans Kaashoek and Nickolai Zeldovich, we're looking at modular verification of filesystem software as functional Coq programs, split appropriately into layers with their own formal interfaces, and considering the challenges of sudden hardware failures and asynchronous disk operations. The first stage of the project led to the FSCQ verified file system and Crash Hoare Logic; we're working now on verifying various kinds of file-system optimizations, especially based on different flavors of concurrency.
Hot Off the Presses
- We are designing a suitable foundation for a Bedrock system that supports multicore concurrent programs with fine-grained synchronization. The key challenge (which others are looking at, too) is how to support modular proofs of parts of programs in isolation, providing elegant notions of specifications and proofs. We're aiming for the same level of rigor and proof automation that Bedrock already provides for sequential programs.
- We're also building Web-based tools for teaching formal logic via Coq exercises embedded in pages.
Gauging Fit in My Group
Interested in joining the group or otherwise collaborating? Feel free to contact me if you have questions about what we could work together on. In recruiting students and postdocs, I'm generally looking for solid background in both implementation and theory. Experience with statically typed functional programming languages like OCaml and Haskell is also pretty important, and Coq experience is useful. I'll at least want to be convinced that potential students can get up to speed quickly with both kinds of tools, and the best way to do that is get started using them and show me something that you've built!
A good online introduction to Coq is Software Foundations (many undergraduates hunker down and learn from it independently over a summer or so), and my book introduces Coq for a more advanced audience, covering more of the tricky features that pay off in large developments.