With the increase in software vulnerabilities that cause significant economic and social losses, automatic vulnerability detection has become essential in software development and maintenance. Recently, large language models (LLMs) like GPT have received considerable attention due to their stunning intelligence, and some studies consider using ChatGPT for vulnerability detection. However, they do not fully consider the characteristics of LLMs, since their designed questions to ChatGPT are simple without a specific prompt design tailored for vulnerability detection. This paper launches a study on the performance of software vulnerability detection using ChatGPT with different prompt designs. Firstly, we complement previous work by applying various improvements to the basic prompt. Moreover, we incorporate structural and sequential auxiliary information to improve the prompt design. Besides, we leverage ChatGPT's ability of memorizing multi-round dialogue to design suitable prompts for vulnerability detection. We conduct extensive experiments on two vulnerability datasets to demonstrate the effectiveness of prompt-enhanced vulnerability detection using ChatGPT. We also analyze the merit and demerit of using ChatGPT for vulnerability detection.
翻译:随着软件漏洞导致重大经济和社会损失的增加,自动漏洞检测在软件开发和维护中变得至关重要。近年来,像GPT这样的大型语言模型因其惊人的智能而受到广泛关注,一些研究考虑使用ChatGPT进行漏洞检测。然而,这些研究并未充分考虑大型语言模型的特性,因为它们为ChatGPT设计的问题较为简单,缺乏专门针对漏洞检测的提示设计。本文针对使用ChatGPT进行软件漏洞检测的性能展开了研究,重点探讨不同提示设计的影响。首先,我们通过对基础提示应用多种改进来补充先前的工作。此外,我们结合了结构和序列辅助信息以优化提示设计。同时,我们利用ChatGPT记忆多轮对话的能力,设计适用于漏洞检测的提示。我们在两个漏洞数据集上进行了大量实验,以证明使用ChatGPT进行提示增强型漏洞检测的有效性。我们还分析了使用ChatGPT进行漏洞检测的优点与不足。