In this paper we present a novel solution that combines the capabilities of Large Language Models (LLMs) with Formal Verification strategies to verify and automatically repair software vulnerabilities. Initially, we employ Bounded Model Checking (BMC) to locate the software vulnerability and derive a counterexample. The counterexample provides evidence that the system behaves incorrectly or contains a vulnerability. The counterexample that has been detected, along with the source code, are provided to the LLM engine. Our approach involves establishing a specialized prompt language for conducting code debugging and generation to understand the vulnerability's root cause and repair the code. Finally, we use BMC to verify the corrected version of the code generated by the LLM. As a proof of concept, we create ESBMC-AI based on the Efficient SMT-based Context-Bounded Model Checker (ESBMC) and a pre-trained Transformer model, specifically gpt-3.5-turbo, to detect and fix errors in C programs. Our experimentation involved generating a dataset comprising 1000 C code samples, each consisting of 20 to 50 lines of code. Notably, our proposed method achieved an impressive success rate of up to 80% in repairing vulnerable code encompassing buffer overflow and pointer dereference failures. We assert that this automated approach can effectively incorporate into the software development lifecycle's continuous integration and deployment (CI/CD) process.
翻译:本文提出了一种融合大语言模型(LLMs)与形式化验证策略的创新方案,用于自动化检测与修复软件漏洞。首先,我们采用有界模型检验(BMC)定位漏洞并生成反例——该反例可证明系统存在异常行为或安全缺陷。将检测到的反例与源代码共同输入LLM引擎后,我们设计了一种专门用于代码调试与生成的提示语言,以解析漏洞根本原因并修复代码。最后,通过BMC验证LLM生成的修正版本代码的准确性。作为概念验证,我们基于高效SMT上下文有界模型检验器(ESBMC)与预训练Transformer模型(具体为gpt-3.5-turbo)构建了ESBMC-AI系统,用于检测与修复C程序中的错误。实验中我们生成了包含1000个C代码样本的数据集(每个样本代码量20-50行)。值得注意的是,该方法对包含缓冲区溢出与指针解引用故障的漏洞代码,实现了最高80%的修复成功率。我们主张该自动化方法可有效整合至软件开发生命周期的持续集成/持续部署(CI/CD)流程中。