Symbolic execution is a program analysis technique executing programs with symbolic instead of concrete inputs. This principle allows for exploring many program paths at once. Despite its wide adoption -- in particular for program testing -- little effort was dedicated to studying the semantic foundations of symbolic execution. Without these foundations, critical questions regarding the correctness of symbolic executors cannot be satisfyingly answered: Can a reported bug be reproduced, or is it a false positive (soundness)? Can we be sure to find all bugs if we let the testing tool run long enough (completeness)? This paper presents a systematic approach for engineering provably sound and complete symbolic execution-based bug finders by relating a programming language's operational semantics with a symbolic semantics. In contrast to prior work on symbolic execution semantics, we address the correctness of critical implementation details of symbolic bug finders, including the search strategy and the role of constraint solvers to prune the search space. We showcase our approach by implementing WiSE, a prototype of a verified bug finder for an imperative language, in the Coq proof assistant and proving it sound and complete. We demonstrate that the design principles of WiSE survive outside the ecosystem of interactive proof assistants by (1) automatically extracting an OCaml implementation and (2) transforming WiSE to PyWiSE, a functionally equivalent Python version.
翻译:符号执行是一种使用符号输入而非具体输入执行程序的分析技术。该原理允许一次性探索大量程序路径。尽管符号执行被广泛采用——特别是在程序测试领域——但针对其语义基础的研究投入却十分有限。缺乏这些基础,关于符号执行器正确性的关键问题便无法得到令人满意的解答:报告的缺陷是否可复现,抑或属于误报(可靠性)?若测试工具运行时间足够长,能否确保发现所有缺陷(完备性)?本文提出了一种系统性方法,通过将编程语言的操作语义与符号语义相关联,来工程化可证明可靠且完备的基于符号执行的缺陷检测器。与先前关于符号执行语义的研究不同,我们重点关注符号缺陷检测器中关键实现细节的正确性,包括搜索策略以及约束求解器在剪枝搜索空间中的作用。我们通过以下方式展示该方法:在Coq证明助手中实现WiSE——一款面向命令式语言的已验证缺陷检测器原型,并证明其可靠性与完备性。我们通过(1)自动提取OCaml实现和(2)将WiSE转化为功能等价的Python版本PyWiSE,证明WiSE的设计原则在交互式证明助手生态系统之外依然有效。