Not all software is perfect—many apps, programs, and websites are released despite bugs. But the software behind critical systems like cryptographic protocols, medical devices, and space shuttles must be error-free, and ensuring the absence of bugs requires going beyond code reviews and testing. It requires formal verification.
Formal verification involves writing a mathematical proof of your code and is “one of the hardest but also most powerful ways of making sure your code is correct,” says Yuriy Brun, a professorat the University of Massachusetts Amherst.
To make formal verification easier, Brun and his colleagues devised a new AI-powered method called Baldur to automatically generate proofs. The accompanying paper, presented in December 2023 at the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering in San Francisco, won a Distinguished Paper award. The team includes Emily First, who completed the study as part of her doctoral dissertation at UMass Amherst; Markus Rabe, a former researcher at Google, where the study was conducted; and Talia Ringer, an assistant professor at the University of Illinois Urbana-Champaign.