Leveraging Large Language Models for Automated Proof Synthesis in Rust

Formal verification can provably guarantee the correctness of critical system software, but the high proof burden has long hindered its wide adoption. Recently, Large Language Models (LLMs) have shown success in code analysis and synthesis. In this paper, we present a combination of LLMs and static analysis to synthesize invariants, assertions, and other proof structures for a Rust-based formal verification framework called Verus. In a few-shot setting, LLMs demonstrate impressive logical ability in generating postconditions and loop invariants, especially when analyzing short code snippets. However, LLMs lack the ability to retain and propagate context information, a strength of traditional static analysis. Based on these observations, we developed a prototype based on OpenAI's GPT-4 model. Our prototype decomposes the verification task into multiple smaller ones, iteratively queries GPT-4, and combines its output with lightweight static analysis. We evaluated the prototype with a developer in the automation loop on 20 vector-manipulating programs. The results demonstrate that it significantly reduces human effort in writing entry-level proof code.

翻译：形式化验证可以可证明地保证关键系统软件的正确性，但长期以来高昂的证明负担阻碍了其广泛采用。近年来，大型语言模型在代码分析和合成方面展现出了成功应用。本文提出了一种结合大型语言模型和静态分析的方法，为名为Verus的基于Rust的形式化验证框架合成不变量、断言及其他证明结构。在少样本场景下，大型语言模型在生成后置条件和循环不变量方面展现出令人印象深刻的逻辑能力，尤其在分析短代码片段时表现突出。然而，大型语言模型缺乏保留和传播上下文信息的能力，而这正是传统静态分析的优势所在。基于这些观察，我们开发了一个基于OpenAI GPT-4模型的原型系统。该原型将验证任务分解为多个子任务，迭代查询GPT-4，并将其输出与轻量级静态分析相结合。我们通过一位开发者在自动化循环中对该原型进行了评估，测试对象为20个向量操作程序。结果表明，该方法显著减少了编写入门级证明代码所需的人工工作量。

相关内容

Automator

关注 5

Automator是苹果公司为他们的Mac OS X系统开发的一款软件。 只要通过点击拖拽鼠标等操作就可以将一系列动作组合成一个工作流，从而帮助你自动的（可重复的）完成一些复杂的工作。Automator还能横跨很多不同种类的程序，包括：查找器、Safari网络浏览器、iCal、地址簿或者其他的一些程序。它还能和一些第三方的程序一起工作，如微软的Office、Adobe公司的Photoshop或者Pixelmator等。

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日