Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification

Formal verification provides a rigorous and systematic approach to ensuring the correctness and reliability of software systems. Yet constructing the specifications needed for a full proof relies on domain expertise and non-trivial manpower, so an automated approach to specification synthesis is desirable. Existing automated approaches, however, are limited in versatility: they either focus only on synthesizing loop invariants for numerical programs, or are tailored to specific types of programs or invariants. Programs involving multiple complicated data types (e.g., arrays, pointers) and code structures (e.g., nested loops, function calls) are often beyond their capabilities. To help bridge this gap, we present AutoSpec, an automated approach to synthesizing specifications for automated program verification. It overcomes the shortcomings of existing work in specification versatility, synthesizing satisfiable and adequate specifications for a full proof. It is driven by static analysis and program verification, and is empowered by large language models (LLMs). AutoSpec addresses the practical challenges in three ways: (1) with AutoSpec driven by static analysis and program verification, LLMs serve as generators of candidate specifications; (2) programs are decomposed to direct the attention of the LLMs; and (3) candidate specifications are validated in each round to avoid error accumulation during the interaction with the LLMs. In this way, AutoSpec can incrementally and iteratively generate satisfiable and adequate specifications. The evaluation shows its effectiveness and usefulness: it outperforms existing works by successfully verifying 79% of programs through automatic specification synthesis, a significant improvement of 1.592x. It can also be successfully applied to verify the programs in a real-world X509-parser project.
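The generate-and-validate loop described in the abstract can be illustrated with a minimal Python sketch. This is not AutoSpec's implementation: the function `synthesize_specs`, the callbacks `propose`, `holds`, and `proves_goal`, and the toy data are all hypothetical names introduced here. In a real pipeline, `propose` would query an LLM about one decomposed program component, and `holds` and `proves_goal` would call a program verifier; here they are canned stand-ins so the control flow can be run as-is.

```python
from typing import Callable, Dict, List, Tuple

Specs = Dict[str, List[str]]  # component name -> accepted specifications


def synthesize_specs(
    components: List[str],
    propose: Callable[[str, List[str]], List[str]],
    holds: Callable[[str, str], bool],
    proves_goal: Callable[[Specs], bool],
    max_rounds: int = 3,
) -> Tuple[Specs, bool]:
    """Iteratively collect validated specifications, one component at a time.

    components:  decomposed program parts (assumed ordered inner-first).
    propose:     hypothetical stand-in for the LLM candidate generator.
    holds:       hypothetical stand-in for the verifier's satisfiability check.
    proves_goal: hypothetical stand-in for the adequacy (full proof) check.
    """
    accepted: Specs = {name: [] for name in components}
    for _ in range(max_rounds):
        for name in components:
            for candidate in propose(name, accepted[name]):
                # Validate every candidate immediately so that a wrong
                # specification never feeds into a later round.
                if candidate not in accepted[name] and holds(name, candidate):
                    accepted[name].append(candidate)
        if proves_goal(accepted):
            return accepted, True
    return accepted, False


# Toy stand-ins (assumptions for illustration only).
CANNED = {
    "inner_loop": [["0 <= j"], ["j <= n", "j < 0"]],  # "j < 0" is a bad guess
    "outer_loop": [["0 <= i"], ["i <= n"]],
}
VALID = {"0 <= j", "j <= n", "0 <= i", "i <= n"}


def toy_propose(name: str, known: List[str]) -> List[str]:
    # Return the next canned batch, based on how many specs are already known.
    batch = min(len(known), len(CANNED[name]) - 1)
    return CANNED[name][batch]


def toy_holds(name: str, candidate: str) -> bool:
    return candidate in VALID


def toy_proves_goal(specs: Specs) -> bool:
    found = {s for group in specs.values() for s in group}
    return VALID <= found


if __name__ == "__main__":
    specs, proved = synthesize_specs(
        ["inner_loop", "outer_loop"], toy_propose, toy_holds, toy_proves_goal
    )
    print(proved, specs)
```

In this toy run the incorrect candidate `j < 0` is rejected as soon as it is proposed, and the loop stops once the accepted set is enough for the stand-in adequacy check, which mirrors the "satisfiable and adequate" criteria named in the abstract.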

