SpecGen: Automated Generation of Formal Program Specifications via Large Language Models

Formal program specifications play a crucial role in various stages of software development. However, manually crafting formal program specifications is rather difficult, making the job time-consuming and labor-intensive. It is even more challenging to write specifications that correctly and comprehensively describe the semantics of complex programs. To reduce the burden on software developers, automated specification generation methods have emerged. However, existing methods usually rely on predefined templates or grammar, making them struggle to accurately describe the behavior and functionality of complex real-world programs. To tackle this challenge, we introduce SpecGen, a novel technique for formal program specification generation based on Large Language Models. Our key insight is to overcome the limitations of existing methods by leveraging the code comprehension capability of LLMs. The process of SpecGen consists of two phases. The first phase employs a conversational approach that guides the LLM to generate appropriate specifications for a given program. The second phase, designed for where the LLM fails to generate correct specifications, applies four mutation operators to the model-generated specifications and selects verifiable specifications from the mutated ones through a novel heuristic selection strategy. We evaluate SpecGen on two datasets, including the SV-COMP Java category benchmark and a manually constructed dataset. Experimental results demonstrate that SpecGen succeeds in generating verifiable specifications for 279 out of 385 programs, outperforming the existing purely LLM-based approaches and conventional specification generation tools like Houdini and Daikon. Further investigations on the quality of generated specifications indicate that SpecGen can comprehensively articulate the behaviors of the input program.

翻译：形式化程序规约在软件开发的各个阶段都起着至关重要的作用。然而，手动编写形式化程序规约相当困难，导致这项工作耗时耗力。若要编写出正确且全面描述复杂程序语义的规约，则更具挑战性。为减轻软件开发人员的负担，自动化规约生成方法应运而生。然而，现有方法通常依赖于预定义的模板或语法，难以准确描述复杂实际程序的行为和功能。为应对这一挑战，我们提出了SpecGen，一种基于大语言模型的形式化程序规约生成新技术。我们的核心思路是通过利用大语言模型的代码理解能力来克服现有方法的局限性。SpecGen的生成过程包含两个阶段：第一阶段采用对话式方法，引导大语言模型为给定程序生成合适的规约；第二阶段针对大语言模型未能生成正确规约的情况，对模型生成的规约应用四种变异算子，并通过一种新颖的启发式选择策略从变异后的规约中筛选出可验证的规约。我们在两个数据集上评估了SpecGen，包括SV-COMP Java类别基准测试集和一个人工构建的数据集。实验结果表明，SpecGen成功为385个程序中的279个生成了可验证的规约，其性能优于现有的纯大语言模型方法以及Houdini、Daikon等传统规约生成工具。对生成规约质量的进一步分析表明，SpecGen能够全面清晰地阐述输入程序的行为。

相关内容

Automator

关注 5

Automator是苹果公司为他们的Mac OS X系统开发的一款软件。 只要通过点击拖拽鼠标等操作就可以将一系列动作组合成一个工作流，从而帮助你自动的（可重复的）完成一些复杂的工作。Automator还能横跨很多不同种类的程序，包括：查找器、Safari网络浏览器、iCal、地址簿或者其他的一些程序。它还能和一些第三方的程序一起工作，如微软的Office、Adobe公司的Photoshop或者Pixelmator等。

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日