This paper presents LProtector, an automated vulnerability detection system for C/C++ codebases built on the large language model (LLM) GPT-4o and Retrieval-Augmented Generation (RAG). As software grows more complex, traditional methods struggle to detect vulnerabilities effectively. LProtector leverages GPT-4o's code comprehension and generation capabilities to perform binary classification and identify vulnerabilities within target codebases. Experiments on the Big-Vul dataset show that LProtector outperforms two state-of-the-art baselines in F1 score, demonstrating the potential of integrating LLMs into vulnerability detection.
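To make the terms above concrete, the following is a minimal sketch of the general idea of retrieval-augmented binary classification scored by F1. It is not LProtector's implementation. The token-overlap retriever, the prompt wording, the helper names (`retrieve`, `build_prompt`, `f1_score`), and the toy snippets are all assumptions made for illustration, and the call to GPT-4o is deliberately left out.

```python
import math
import re
from collections import Counter


def tokenize(code: str) -> Counter:
    """Bag of identifier-like tokens (a crude stand-in for a real embedding)."""
    return Counter(re.findall(r"[A-Za-z_]\w*", code))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: list[tuple[str, int]], k: int = 2):
    """Return the k labeled snippets most similar to the query snippet."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda ex: cosine(q, tokenize(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, examples: list[tuple[str, int]]) -> str:
    """Assemble a prompt with retrieved examples (hypothetical wording)."""
    parts = ["Answer 1 if the final snippet is vulnerable, otherwise 0."]
    for code, label in examples:
        parts.append(f"Snippet:\n{code}\nLabel: {label}")
    parts.append(f"Snippet:\n{query}\nLabel:")
    return "\n\n".join(parts)


def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """F1 for the positive (vulnerable = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labeled corpus (made up for this sketch; 1 = vulnerable, 0 = not).
CORPUS = [
    ("strcpy(buf, src);", 1),
    ("strncpy(buf, src, sizeof(buf) - 1);", 0),
    ("int x = a + b;", 0),
]

if __name__ == "__main__":
    query = "strcpy(dst, input);"
    prompt = build_prompt(query, retrieve(query, CORPUS, k=1))
    print(prompt)  # this string would be sent to the LLM; that call is omitted
    print(f1_score([1, 0, 1, 1], [1, 0, 0, 1]))
```

In a real system the retriever would typically use learned embeddings and a vector store rather than token overlap, and the predicted labels would come from parsing the model's responses. The F1 function only shows how the reported metric combines precision and recall.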