The demonstrated code-understanding capability of LLMs raises the question of whether they can be used for automated program verification, a task that demands high-level abstract reasoning about program properties that is challenging for verification tools. We propose a general methodology to combine the power of LLMs and automated reasoners for automated program verification. We formally describe this methodology as a set of transition rules and prove its soundness. We instantiate the calculus as a sound automated verification procedure and demonstrate practical improvements on a set of synthetic and competition benchmarks.
翻译:大规模语言模型(LLM)在代码理解方面所展现的能力引发了一个问题:它们是否可被用于自动化程序验证——这一任务要求对程序属性进行高度抽象的推理,而这对验证工具而言颇具挑战。我们提出了一种通用方法论,旨在融合LLM与自动化推理器的能力以支持自动化程序验证。本文以一组转换规则的形式形式化描述了该方法论,并证明了其可靠性。我们将该演算实例化为一个可靠的自动化验证流程,并在合成基准与竞赛基准上展示了其实际性能提升。