SpecGen: Automated Generation of Formal Program Specifications via Large Language Models

Formal program specifications play a crucial role in many stages of software development. However, writing them by hand is difficult, time-consuming, and labor-intensive, and it is harder still to write specifications that correctly and comprehensively describe the semantics of complex programs. Automated specification generation methods have emerged to reduce this burden on developers, but existing methods usually rely on predefined templates or grammars, so they struggle to accurately describe the behavior and functionality of complex real-world programs. To tackle this challenge, we introduce SpecGen, a novel technique for formal program specification generation based on Large Language Models (LLMs). Our key insight is to overcome the limitations of existing methods by leveraging the code comprehension capability of LLMs. SpecGen works in two phases. The first phase employs a conversational approach that guides the LLM to generate appropriate specifications for a given program. The second phase, designed for cases where the LLM fails to generate correct specifications, applies four mutation operators to the model-generated specifications and selects verifiable specifications from the mutated ones through a novel heuristic selection strategy. We evaluate SpecGen on two datasets: the SV-COMP Java category benchmark and a manually constructed dataset. Experimental results show that SpecGen generates verifiable specifications for 279 out of 385 programs, outperforming existing purely LLM-based approaches and conventional specification generation tools such as Houdini and Daikon. Further investigation of the quality of the generated specifications indicates that SpecGen can comprehensively articulate the behaviors of the input program.
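
The second phase (mutate a failing specification, then keep a variant that verifies) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from SpecGen: the single comparison-swapping operator stands in for the four mutation operators, which the abstract does not name; `toy_verifier` replaces a real program verifier with a check over sample inputs; and "return the first variant that passes" stands in for the heuristic selection strategy. The names `mutate`, `toy_verifier`, and `repair` are hypothetical.

```python
import re
from typing import Callable, Iterable, List, Optional

# Hypothetical mutation table: each comparison operator maps to nearby alternatives.
COMPARISON_SWAPS = {
    ">": [">=", "<"],
    ">=": [">", "<="],
    "<": ["<=", ">"],
    "<=": ["<", ">="],
    "==": [">=", "<="],
}
# Two-character operators are listed first so ">=" is not split into ">" and "=".
TOKEN = re.compile(r"(>=|<=|==|>|<)")


def mutate(spec: str) -> List[str]:
    """Return variants of `spec` with exactly one comparison operator replaced."""
    variants = []
    for match in TOKEN.finditer(spec):
        for replacement in COMPARISON_SWAPS[match.group()]:
            variants.append(spec[:match.start()] + replacement + spec[match.end():])
    return variants


def toy_verifier(program: Callable[[int], int],
                 samples: Iterable[int]) -> Callable[[str], bool]:
    """Stand-in for a real verifier: test the postcondition on sample inputs only."""
    inputs = list(samples)

    def verifies(spec: str) -> bool:
        # `spec` is a Python expression over `x` (input) and `result` (output).
        return all(
            eval(spec, {"__builtins__": {}}, {"x": x, "result": program(x)})
            for x in inputs
        )

    return verifies


def repair(spec: str, verifies: Callable[[str], bool]) -> Optional[str]:
    """Keep `spec` if it verifies; otherwise return the first mutant that does."""
    if verifies(spec):
        return spec
    for candidate in mutate(spec):
        if verifies(candidate):
            return candidate
    return None


if __name__ == "__main__":
    check = toy_verifier(abs, range(-3, 4))
    # "result > 0" is too strong for abs (it fails at x = 0), so it is mutated.
    print(repair("result > 0", check))
```

Here the candidate postcondition `result > 0` fails for `abs` at `x = 0`, and the sketch recovers the weaker `result >= 0`. Checking sample inputs is far weaker than formal verification; it only makes the mutate-then-select loop concrete.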

