May 29, 2023

How Effective Are Neural Networks for Fixing Security Vulnerabilities


Yi Wu, Nan Jiang, Hung Viet Pham, Thibaud Lutellier, Jordan Davis, Lin Tan, Petr Babkin, Sameena Shah

From arXiv. This paper has been accepted to appear in the proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2023) and will be presented at the conference, to be held in Seattle, USA, 17-21 July 2023.

Security vulnerability repair is a difficult task that is in dire need of automation. Two groups of techniques have shown promise: (1) large code language models (LLMs) that have been pre-trained on source code for tasks such as code completion, and (2) automated program repair (APR) techniques that use deep learning (DL) models to automatically fix software bugs. This paper is the first to study and compare the Java vulnerability repair capabilities of LLMs and DL-based APR models. Our contributions are that we (1) apply and evaluate five LLMs (Codex, CodeGen, CodeT5, PLBART, and InCoder), four fine-tuned LLMs, and four DL-based APR techniques on two real-world Java vulnerability benchmarks (Vul4J and VJBench); (2) design code transformations to address the threat of overlap between Codex's training data and the test data; (3) create a new Java vulnerability repair benchmark, VJBench, and its transformed version, VJBench-trans; and (4) evaluate LLMs and APR techniques on the transformed vulnerabilities in VJBench-trans. Our findings are as follows. (1) Existing LLMs and APR models fix very few Java vulnerabilities; Codex fixes the most, 10.2 (20.4%). (2) Fine-tuning with general APR data improves LLMs' vulnerability-fixing capabilities. (3) Our new VJBench reveals that LLMs and APR models fail to fix many Common Weakness Enumeration (CWE) types, such as CWE-325 (Missing cryptographic step) and CWE-444 (HTTP request smuggling). (4) Codex still fixes 8.3 transformed vulnerabilities, outperforming all the other LLMs and APR models on transformed vulnerabilities. The results call for innovations to enhance automated Java vulnerability repair, such as creating larger vulnerability repair training datasets, tuning LLMs with such data, and applying code simplification transformations to facilitate vulnerability repair.
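The abstract does not spell out what its code transformations look like. As a rough illustration only, the sketch below shows one plausible semantics-preserving transformation, identifier renaming, applied to a small Java snippet so that the code no longer matches text a model may have seen during training. The snippet, the function `rename_identifiers`, and the rename mapping are all hypothetical and are not taken from the paper; a regex is used here as a stand-in for a real Java parser, so it would also rename matches inside string literals and comments.

```python
import re

# Hypothetical Java method used only as sample input (not from the paper).
JAVA_SNIPPET = """\
int readLength(byte[] buffer, int offset) {
    int length = buffer[offset];
    return length;
}
"""


def rename_identifiers(source: str, mapping: dict[str, str]) -> str:
    """Rename whole-word identifiers according to `mapping`.

    This is a regex approximation: it does not parse Java, so it cannot
    tell identifiers apart from the same words in strings or comments.
    """
    if not mapping:
        return source
    # \b anchors keep "length" from matching inside "readLength".
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], source)


# Assumed rename mapping, chosen for illustration.
transformed = rename_identifiers(
    JAVA_SNIPPET, {"buffer": "data", "offset": "start", "length": "size"}
)
print(transformed)
```

The method name `readLength` is left untouched because the match is case-sensitive and word-bounded, while the parameters and the local variable are renamed consistently, so the transformed method behaves the same as the original.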

