A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback

Recent work in automated program repair (APR) proposes the use of reasoning and patch validation feedback to reduce the semantic gap between LLMs and the code under analysis. The idea has been shown to perform well for general APR, but its effectiveness in more specific contexts remains underexplored. In this work, we assess the impact of reasoning and patch validation feedback on LLMs in the context of vulnerability repair, an important and challenging task in security. To support the evaluation, we present VRpilot, an LLM-based vulnerability repair technique based on reasoning and patch validation feedback. VRpilot (1) uses a chain-of-thought prompt to reason about a vulnerability prior to generating patch candidates and (2) iteratively refines prompts according to the output of external tools (e.g., compilers, code sanitizers, and test suites) on previously generated patches. To evaluate performance, we compare VRpilot against state-of-the-art vulnerability repair techniques for C and Java using public datasets from the literature. Our results show that VRpilot generates, on average, 14% and 7.6% more correct patches than the baseline techniques on C and Java, respectively. We show, through an ablation study, that reasoning and patch validation feedback are critical. We report several lessons from this study and potential directions for advancing LLM-empowered vulnerability repair.
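
The two ingredients named in the abstract, a chain-of-thought prompt and iterative refinement driven by validation feedback, can be illustrated with a minimal sketch. This is not VRpilot's implementation: the names `Validation`, `build_prompt`, and `repair`, the prompt wording, and the iteration limit are all hypothetical, and the LLM and the external tools are passed in as plain callables so that no particular API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Validation:
    """Outcome of running external tools on a candidate patch (hypothetical type)."""
    ok: bool
    feedback: str  # e.g. a compiler error, sanitizer report, or failing test name


def build_prompt(vulnerable_code: str, history: List[str]) -> str:
    """Build a chain-of-thought style prompt, appending feedback from earlier attempts."""
    parts = [
        "Explain step by step why the following code is vulnerable, "
        "then propose a patch.",
        vulnerable_code,
    ]
    for i, feedback in enumerate(history, start=1):
        parts.append(f"Attempt {i} was rejected by validation: {feedback}")
    return "\n\n".join(parts)


def repair(
    vulnerable_code: str,
    generate: Callable[[str], str],         # stand-in for an LLM call (assumption)
    validate: Callable[[str], Validation],  # stand-in for compiler/sanitizer/tests
    max_iters: int = 3,                     # illustrative limit, not from the paper
) -> Optional[str]:
    """Return the first patch that passes validation, or None if none does."""
    history: List[str] = []
    for _ in range(max_iters):
        patch = generate(build_prompt(vulnerable_code, history))
        result = validate(patch)
        if result.ok:
            return patch
        # Feed the tool output back into the next prompt.
        history.append(result.feedback)
    return None
```

In practice `generate` would wrap a model call and `validate` would compile the patched program and run sanitizers and tests; keeping both as parameters makes the loop itself easy to exercise with stubs.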

