Abstract: In this talk, I explore how automated reasoning techniques can enhance the safety and reliability of neural network–enabled systems. I discuss two complementary approaches: direct verification and output verification. Direct verification focuses on formulating input–output specifications and proving that a neural network satisfies them. This approach is particularly relevant for non-generative ML systems deployed in mission-critical domains such as robotics and autonomous systems. I will describe Marabou, a formal analysis tool for neural networks, and highlight several of its recent applications. In contrast, output verification targets ML systems that generate complex artifacts with formal semantics, such as programs or logical formulas. Because correctness depends on the semantics of the generated artifacts, different domain-specific automated reasoning techniques are required. As a case study, I will describe how we recently applied formal verification to evaluate state-of-the-art Text-to-SQL methods, revealing insights that standard test-based evaluations overlooked.
Bio: Andrew (Haoze) Wu is an Assistant Professor in Computer Science at Amherst College and an affiliated researcher at VMware by Broadcom. He completed his PhD in Computer Science at Stanford University in 2024. Andrew's work aims to bring together automated reasoning and machine learning to create safer and smarter computer systems. He is the lead developer of the Marabou framework, a state-of-the-art neural network verification tool widely used in academia and industry. His work has been published in top venues in formal methods and artificial intelligence.