Foundational Integration Verification of a Cryptographic Server

Andres Erbsen, Jade Philipoom, Dustin Jamner, Ashley Lin, Samuel Gruetter, Clément Pit-Claudel, Adam Chlipala. Foundational Integration Verification of a Cryptographic Server. Proceedings of the 45th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'24). June 2024.

Coming soon!

Some important dimensions of a formal verification might be called its stack height, or how many layers of a system are proved as a whole; tool diversity, or heterogeneity of tools used to verify different parts of a system; and foundationalness, or minimality of the tool-implementation code that must be trusted to believe verification results. This paper presents our verification case study that is, to our knowledge, the first to reach our level of tool diversity while covering all software installed in a demonstration computer system and roughly maxing out foundationalness within a proof assistant. That is, our unified proof is checked by Coq, and its statement depends on neither syntax nor semantics for any programming language but machine language, nor do we leave any compiler or verification tool as trusted, beyond Coq's kernel proof checker. Our case study is a simple cryptographic server for flipping of a bit of state through authenticated network messages, adopting code conventions of embedded systems, like static bounds on memory usage. We believe it is the first formal verification covering a full software stack that models both network input and output. Perhaps most distinctive is that we have chosen quite different programming languages and verification tools for different parts of the software stack (e.g., some libraries are proved as functional programs with equational reasoning, and others are proved as C-like programs or even machine code with separation logic), yet all of the correctness theorems for libraries and layers are integrated into one foundational result. We developed a unified specification style that is able to support that diversity well, and we experienced widely varying productivity with the different verification styles we employed, some of which we recommend for future work and some of which we do not.