Impact of Code Language Models on Automated Program Repair

Automated program repair (APR) aims to help developers improve software reliability by generating patches for buggy programs. Although many code language models (CLMs) have been developed and are effective in software tasks such as code completion, there has been little comprehensive, in-depth work evaluating CLMs' fixing capabilities or fine-tuning CLMs for the APR task. First, this work is the first to evaluate ten CLMs on four APR benchmarks, and it shows that, surprisingly, the best CLM, as is, fixes 72% more bugs than the state-of-the-art deep-learning (DL)-based APR techniques. Second, we created one of the four APR benchmarks in this paper to avoid data leakage and keep the evaluation fair. Third, this is the first work to fine-tune CLMs with APR training data, and it shows that fine-tuning brings a 31%-1,267% improvement to CLMs and enables them to fix 46%-164% more bugs than existing DL-based APR techniques. Fourth, this work studies the impact of buggy lines, showing that CLMs, as is, cannot make good use of the buggy lines to fix bugs, while fine-tuned CLMs could potentially over-rely on them. Lastly, this work analyzes the size, time, and memory efficiency of different CLMs. The results point to promising directions for the APR domain, such as fine-tuning CLMs with APR-specific designs. They also raise awareness of the need for fair and comprehensive evaluations of CLMs, and call for more transparent reporting of the open-source repositories used in pre-training data to address the data leakage problem.
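Figures such as "fixes 72% more bugs" are relative improvements over a baseline count of fixed bugs. The sketch below shows that arithmetic, assuming the usual definition of relative improvement; the function name and the bug counts are hypothetical illustrations, not numbers reported in the paper.

```python
def relative_improvement(baseline_fixed: int, new_fixed: int) -> float:
    """Return how many percent more bugs `new_fixed` is than `baseline_fixed`.

    Assumed definition: 100 * (new - baseline) / baseline.
    """
    if baseline_fixed <= 0:
        raise ValueError("baseline_fixed must be positive")
    return 100.0 * (new_fixed - baseline_fixed) / baseline_fixed


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    baseline = 50   # bugs fixed by a baseline technique (made up)
    candidate = 86  # bugs fixed by a candidate model (made up)
    print(f"{relative_improvement(baseline, candidate):.0f}% more bugs fixed")
```

Under this definition, a value above 100% means the candidate fixes more than twice as many bugs as the baseline, which is why improvements over a very small baseline can reach four digits.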

