Automated program verification has always been an important component of building trustworthy software. While the analysis of real-world programs remains a theoretical challenge, the automation of loop invariant analysis has effectively resolved the problem. However, real-world programs that often mix complex data structures and control flows pose challenges to traditional loop invariant generation tools. To enhance the applicability of invariant generation techniques, we proposed ACInv, an Automated Complex program loop Invariant generation tool, which combines static analysis with Large Language Models (LLMs) to generate the proper loop invariants. We utilize static analysis to extract the necessary information for each loop and embed it into prompts for the LLM to generate invariants for each loop. Subsequently, we employ an LLM-based evaluator to assess the generated invariants, refining them by either strengthening, weakening, or rejecting them based on their correctness, ultimately obtaining enhanced invariants. We conducted experiments on ACInv, which showed that ACInv outperformed previous tools on data sets with data structures, and maintained similar performance to the state-of-the-art tool AutoSpec on numerical programs without data structures. For the total data set, ACInv can solve 21% more examples than AutoSpec and can generate reference data structure templates.
翻译:自动化程序验证一直是构建可信软件的重要组成部分。尽管现实世界程序的分析在理论上仍具挑战,但循环不变量分析的自动化已有效解决了该问题。然而,现实程序中常混合复杂数据结构与控制流,这对传统循环不变量生成工具提出了挑战。为提升不变量生成技术的适用性,我们提出了ACInv——一种自动复杂程序循环不变量生成工具,它结合静态分析与大型语言模型来生成合适的循环不变量。我们利用静态分析提取每个循环的必要信息,并将其嵌入提示词中,供LLM为每个循环生成不变量。随后,我们采用基于LLM的评估器对生成的不变量进行评价,根据其正确性通过强化、弱化或拒绝的方式进行精炼,最终获得增强的不变量。我们在ACInv上进行了实验,结果表明ACInv在包含数据结构的数据集上优于先前工具,在无数据结构的数值程序上则保持了与最先进工具AutoSpec相近的性能。在整个数据集上,ACInv能比AutoSpec多解决21%的示例,并能生成参考数据结构模板。