The demonstrated code-understanding capability of LLMs raises the question of whether they can be used for automated program verification, a task that often demands high-level abstract reasoning about program properties, which is challenging for verification tools. We propose a general methodology to combine the power of LLMs and automated reasoners for automated program verification. We formally describe this methodology as a set of derivation rules and prove its soundness. We instantiate the calculus as a sound automated verification procedure, which led to practical improvements on a set of synthetic and competition benchmarks.
翻译:大型语言模型(LLM)所展现的代码理解能力引发了对其能否用于自动化程序验证的探讨——该任务常需对程序属性进行高级抽象推理,这对现有验证工具颇具挑战性。我们提出一种通用方法,将LLM与自动化推理器的能力相结合以实现程序自动验证。我们以一组推导规则形式形式化描述该方法,并证明其可靠性。我们将该演算体系实例化为可靠的自动化验证流程,并在合成基准及竞赛基准集上取得了实际性改进。