A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests

Emanuele Iannone,Quang-Cuong Bui,Riccardo Scandariato

From arXiv. Accepted in the MSR 2026 Technical Track. This work was partially supported by the EU-funded project Sec4AI4Sec (grant no. 101120393).

Software vulnerabilities are often detected via taint analysis, penetration testing, or fuzzing. They are also found via unit tests that exercise security-sensitive behavior with specific inputs, called vulnerability-witnessing tests. Generative AI models could help developers in writing them, but they require many examples to learn from, which are currently scarce. This paper introduces VuTeCo, an AI-driven framework for collecting examples of vulnerability-witnessing tests from Java repositories. VuTeCo carries out two tasks: (1) The "Finding" task to determine whether a unit test case is security-related, and (2) the "Matching" task to relate a test case to the vulnerability it witnesses. VuTeCo addresses the Finding task with UniXcoder, achieving an F0.5 score of 0.73 and a precision of 0.83 on a test set of unit tests from Vul4J. The Matching task is addressed using DeepSeek Coder, achieving an F0.5 score of 0.65 and a precision of 0.75 on a test set of pairs of unit tests and vulnerabilities from Vul4J. VuTeCo has been used in the wild on 427 Java projects and 1,238 vulnerabilities, obtaining 224 test cases confirmed to be security-related and 35 tests correctly matched to 29 vulnerabilities. The validated tests were collected in a new dataset called Test4Vul. VuTeCo lays the foundation for large-scale retrieval of vulnerability-witnessing tests, enabling future AI models to better understand and generate security unit tests.
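
The abstract reports F0.5 and precision for both tasks but not recall. F0.5 is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below is only an illustration of that standard formula, not code from VuTeCo: the function names `f_beta` and `implied_recall` are hypothetical, and the recall values it derives are back-of-the-envelope estimates from the rounded figures in the abstract, not results reported by the authors.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    With beta < 1 (here 0.5), precision counts more than recall.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def implied_recall(precision: float, f_score: float, beta: float = 0.5) -> float:
    """Solve the F-beta formula for recall, given precision and the F-beta score.

    Hypothetical helper for illustration; inputs taken from rounded
    published figures give only an approximate recall.
    """
    b2 = beta * beta
    denominator = (1.0 + b2) * precision - f_score
    if denominator <= 0.0:
        raise ValueError("no positive recall is consistent with these values")
    return b2 * precision * f_score / denominator


if __name__ == "__main__":
    # (task, precision, F0.5) as stated in the abstract.
    reported = [("Finding", 0.83, 0.73), ("Matching", 0.75, 0.65)]
    for task, precision, f_half in reported:
        recall = implied_recall(precision, f_half)
        check = f_beta(precision, recall)
        print(f"{task}: implied recall ~ {recall:.2f}, F0.5 check = {check:.2f}")
```

Under these assumptions, the implied recall comes out at roughly 0.49 for the Finding task and roughly 0.42 for the Matching task. That is consistent with the choice of F0.5 as the headline metric: when harvesting examples for a dataset, a wrongly included test is arguably more costly than a missed one.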
