Despite the variety of approaches used to detect vulnerabilities, the number of reported vulnerabilities has trended upward over the years. This suggests that many problems are not caught before code is released, which could stem from several factors, such as lack of awareness, the limited efficacy of existing vulnerability detection tools, or tools that are not user-friendly. To help address some of the shortcomings of traditional vulnerability detection tools, we propose using large language models (LLMs) to assist in finding vulnerabilities in source code. LLMs have shown a remarkable ability to understand and generate code, underlining their potential in code-related tasks. Our aim is to test multiple state-of-the-art LLMs and identify the prompting strategies that extract the most value from them. We provide an overview of the strengths and weaknesses of the LLM-based approach and compare its results to those of traditional static analysis tools. We find that LLMs can pinpoint many more issues than traditional static analysis tools, outperforming them in terms of recall and F1 score. The results should benefit software developers and security analysts responsible for ensuring that code is free of vulnerabilities.