CRepair: CVAE-based Automatic Vulnerability Repair Technology

Software vulnerabilities are flaws in computer software systems that threaten the integrity, security, and reliability of modern software and its application data, and they can cause substantial economic losses across industries. Manual vulnerability repair is time-consuming and error-prone. Among the solutions researchers have proposed, learning-based automatic vulnerability repair has attracted widespread attention. However, existing methods tend to improve repair outcomes by learning from more vulnerability data while neglecting the diverse characteristics of vulnerable code, and they suffer from imprecise vulnerability localization. To address these shortcomings, this paper proposes CRepair, a CVAE-based automatic vulnerability repair technology aimed at fixing security vulnerabilities in system code. We first preprocess the vulnerability data with a prompt-based method to form the model input. We then apply causal inference techniques to map the vulnerability feature data to probability distributions. Multi-sample feature fusion captures diverse vulnerability feature information. Finally, conditional control guides the model in repairing the vulnerabilities. Experimental results show that the proposed method significantly outperforms the other benchmark models, achieving a perfect repair rate of 52%. The effectiveness of the approach is validated from multiple perspectives, advancing AI-driven code vulnerability repair and showing promising applications.
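The abstract gives no implementation details, so the following is only a minimal sketch of how a generic CVAE maps features to a distribution, fuses several latent samples, and conditions the decoder. It is not CRepair's actual architecture. Every function name here (`encode`, `fuse_samples`, `decoder_input`) is hypothetical, the "encoder" is a fixed arithmetic stand-in for a learned network, averaging is just one possible fusion choice, and the causal inference step mentioned in the abstract is not modeled.

```python
import math
import random


def encode(features, condition):
    """Toy stand-in for a learned encoder: returns (mu, log_var) for q(z | x, c).

    Assumption: a real CVAE would compute these with a neural network.
    """
    joint = features + condition
    mean = sum(joint) / len(joint)
    mu = [x - mean for x in features]
    log_var = [-abs(x) for x in features]  # arbitrary choice, keeps variance <= 1
    return mu, log_var


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def fuse_samples(mu, log_var, num_samples, rng):
    """Draw several latent samples and average them per dimension.

    Assumption: averaging is one simple way to fuse multiple samples.
    """
    samples = [sample_latent(mu, log_var, rng) for _ in range(num_samples)]
    return [sum(dim) / num_samples for dim in zip(*samples)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def decoder_input(fused_z, condition):
    """Conditional control: the decoder sees the latent code plus the condition."""
    return fused_z + condition


if __name__ == "__main__":
    rng = random.Random(0)
    features = [0.5, -1.0, 2.0]   # hypothetical vulnerability feature vector
    condition = [1.0, 0.0]        # hypothetical condition, e.g. a vulnerability-type flag
    mu, log_var = encode(features, condition)
    z = fuse_samples(mu, log_var, num_samples=8, rng=rng)
    print("fused latent:", z)
    print("KL term:", kl_to_standard_normal(mu, log_var))
    print("decoder input length:", len(decoder_input(z, condition)))
```

In a trained model, the KL term would be added to a reconstruction loss on the repaired code, and the condition vector would carry whatever signal is used to steer the repair.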


