AI Agentic Vulnerability Injection And Transformation with Optimized Reasoning

The increasing complexity of software systems and the sophistication of cyber-attacks have underscored the need for reliable automated software vulnerability detection. Data-driven approaches using deep learning models show promise but critically depend on the availability of large, accurately labeled datasets. Yet existing datasets suffer from noisy labels, offer limited vulnerability coverage, or fail to reflect vulnerabilities as they occur in real-world software. These shortcomings also limit large-scale benchmarking of such solutions. Automated vulnerability injection provides a way to address these limitations, but existing techniques remain limited in coverage, contextual fidelity, or injection success. In this paper, we present AVIATOR, the first AI-agentic vulnerability injection framework. AVIATOR decomposes vulnerability injection into a coordinated workflow of specialized AI agents, tool-based analysis, and iterative self-correction, explicitly mirroring expert reasoning. It integrates retrieval-augmented generation (RAG) and lightweight LoRA-based fine-tuning to produce realistic, category-specific vulnerabilities without relying on handcrafted patterns. Across three benchmarks, AVIATOR achieves high injection fidelity (91-95%), surpassing existing injection techniques in both accuracy and vulnerability coverage. When used for data augmentation to train deep learning-based vulnerability detection (DLVD) models, AVIATOR provides the strongest downstream gains in vulnerability detection. Across models and base datasets, AVIATOR improves average F1 scores by +22% over no augmentation, by +25% over VGX (which held the prior best injection success rate), and by +3% over VulScribeR (the prior state-of-the-art LLM-based injection model), with +7% higher recall and no precision loss. Its augmented data exhibits the lowest distributional distortion, and the approach scales efficiently, with <2% syntax rejection at 4.3x lower cost than VulScribeR.
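The abstract gives no implementation details, so the following is only a minimal sketch of what a generic "propose, check with a tool, retry with feedback" loop might look like. It is not AVIATOR's actual pipeline. The names `inject_with_self_correction`, `toy_propose`, and `toy_validate` are hypothetical; the toy functions stand in for an LLM agent and an analysis tool, respectively.

```python
from typing import Callable, Optional


def inject_with_self_correction(
    benign_code: str,
    propose: Callable[[str, str], str],   # hypothetical agent: (code, feedback) -> candidate
    validate: Callable[[str], str],       # hypothetical tool: candidate -> "" if accepted
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first candidate the validator accepts, or None if none is."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = propose(benign_code, feedback)
        feedback = validate(candidate)
        if not feedback:          # empty feedback means the check passed
            return candidate
    return None


def toy_propose(code: str, feedback: str) -> str:
    # Toy stand-in for an agent: it only edits once it has received feedback.
    if not feedback:
        return code
    return code.replace("if (n < 16) ", "")


def toy_validate(code: str) -> str:
    # Toy stand-in for a tool-based check: reject while the guard is present.
    return "bounds check still present" if "if (n < 16)" in code else ""


benign = "if (n < 16) memcpy(buf, src, n);"
result = inject_with_self_correction(benign, toy_propose, toy_validate)
print(result)
```

In this toy run the first proposal is rejected and the second, produced with the validator's feedback, is accepted. A real system would replace both stand-ins with model calls and program analysis.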

