Unit Test Generation for Vulnerability Exploitation in Java Third-Party Libraries

Open-source third-party libraries are widely used in software development because they save substantial time and resources. However, publicly disclosed vulnerabilities in these libraries are a significant concern. Existing automated vulnerability detection tools often suffer from false positives and fail to accurately assess whether inputs capable of triggering a vulnerability can propagate from a client project to the vulnerable code in a library. In this paper, we propose a novel approach called VULEUT (Vulnerability Exploit Unit Test Generation), which combines vulnerability exploitation reachability analysis with LLM-based unit test generation. VULEUT is designed to automatically verify the exploitability of vulnerabilities in third-party libraries commonly used in client software projects. It first analyzes a client project to determine whether the vulnerability conditions are reachable, and then leverages a Large Language Model (LLM) to generate unit tests that confirm the vulnerability. To evaluate the effectiveness of VULEUT, we collect 32 vulnerabilities from various third-party libraries and conduct experiments on 70 real client projects. We also compare our approach with two representative tools, TRANSFER and VESTA. Our results demonstrate the effectiveness of VULEUT: 229 of the 292 generated unit tests (about 78%) successfully confirm vulnerability exploitation across the 70 client projects, outperforming the baselines by 24%.
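The reachability step can be illustrated with a minimal sketch. It is not VULEUT's actual analysis, which the abstract does not detail. It assumes the client's call graph is already available as a map from caller to callees, and it only checks whether a vulnerable library method can be reached from a client entry point by breadth-first search. The class name `ReachabilitySketch`, the method `findCallPath`, and the method names used in the example below are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain call-graph reachability,
// not the analysis implemented by VULEUT.
final class ReachabilitySketch {

    // Breadth-first search over a caller -> callees map.
    // Returns the call chain from entry to target (inclusive),
    // or an empty list if the target cannot be reached.
    static List<String> findCallPath(Map<String, List<String>> callGraph,
                                     String entry, String target) {
        // parent records how each method was first reached; it doubles as the visited set.
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(entry, null);
        queue.add(entry);

        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(target)) {
                // Walk the parent links back to the entry point, then reverse.
                List<String> path = new ArrayList<>();
                for (String m = current; m != null; m = parent.get(m)) {
                    path.add(m);
                }
                Collections.reverse(path);
                return path;
            }
            for (String callee : callGraph.getOrDefault(current, List.of())) {
                if (!parent.containsKey(callee)) {
                    parent.put(callee, current);
                    queue.add(callee);
                }
            }
        }
        return List.of();
    }
}
```

For a graph in which `ClientApp.handleUpload` calls `ClientApp.parse`, which in turn calls `Lib.vulnerableParse`, `findCallPath` returns that three-method chain. A call chain like this is the kind of context that could be handed to an LLM when asking it to write a unit test that drives the client entry point toward the vulnerable code. Reaching the method is a necessary condition only: the triggering input must also survive along the path, which is the propagation problem the abstract describes.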

