Open-source third-party libraries are widely used in software development because they save substantial time and resources. However, publicly disclosed vulnerabilities in these libraries are a significant concern. Existing automated vulnerability detection tools often produce false positives and fail to assess accurately whether inputs capable of triggering a vulnerability can propagate from a client project to the vulnerable code in a library. In this paper, we propose VULEUT (Vulnerability Exploit Unit Test Generation), a novel approach that combines vulnerability exploitation reachability analysis with LLM-based unit test generation to automatically verify the exploitability of vulnerabilities in third-party libraries commonly used by client software projects. VULEUT first analyzes a client project to determine whether the vulnerability conditions are reachable. It then uses a large language model (LLM) to generate unit tests that confirm the vulnerability. To evaluate VULEUT, we collect 32 vulnerabilities from various third-party libraries and conduct experiments on 70 real client projects. We also compare our approach with two representative tools, TRANSFER and VESTA. The results demonstrate the effectiveness of VULEUT: 229 of the 292 generated unit tests successfully confirm vulnerability exploitation across the 70 client projects, outperforming the baselines by 24%.
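The two-stage idea described in the abstract (reachability analysis first, then LLM-based test generation) can be illustrated with a minimal sketch. This is not VULEUT's implementation: the call-graph representation, the function names `find_call_path` and `build_test_prompt`, the method names in the toy graph, and the prompt wording are all hypothetical, and the sketch only builds a prompt string rather than calling any model.

```python
from collections import deque


def find_call_path(call_graph, entry_points, vulnerable_method):
    """Breadth-first search for a shortest call chain from any client
    entry point to the vulnerable library method; None if unreachable."""
    queue = deque((entry, [entry]) for entry in entry_points)
    visited = set(entry_points)
    while queue:
        method, path = queue.popleft()
        if method == vulnerable_method:
            return path
        for callee in call_graph.get(method, ()):
            if callee not in visited:
                visited.add(callee)
                queue.append((callee, path + [callee]))
    return None


def build_test_prompt(path, vulnerability_note):
    """Assemble a (hypothetical) prompt asking an LLM for a unit test
    that drives input along the reachable call chain."""
    chain = " -> ".join(path)
    return (
        "Write a unit test for the client method at the start of this call "
        "chain so that the input reaches the vulnerable library method.\n"
        f"Call chain: {chain}\n"
        f"Vulnerability: {vulnerability_note}\n"
    )


# Toy call graph; every method name here is made up for illustration.
call_graph = {
    "client.Api.upload": ["client.Util.parse"],
    "client.Util.parse": ["lib.Parser.read"],
    "client.Api.ping": [],
}

path = find_call_path(
    call_graph,
    ["client.Api.upload", "client.Api.ping"],
    "lib.Parser.read",
)
# Only ask for a test when the vulnerable method is actually reachable.
prompt = (
    build_test_prompt(path, "hypothetical unsafe parsing of untrusted input")
    if path
    else None
)
```

Filtering on reachability before generating tests reflects the motivation stated in the abstract: a vulnerable library method that no client entry point can reach would otherwise be reported as a false positive.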