ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair

Automated generate-and-validate (G&V) program repair techniques typically rely on hard-coded rules, only fix bugs following specific patterns, and are hard to adapt to different programming languages. We propose ENCORE, a new G&V technique, which uses ensemble learning on convolutional neural machine translation (NMT) models to automatically fix bugs in multiple programming languages. We take advantage of the randomness in hyper-parameter tuning to build multiple models that fix different bugs and combine them using ensemble learning. This new convolutional NMT approach outperforms the standard long short-term memory (LSTM) approach used in previous work, as it better captures both local and long-distance connections between tokens. Our evaluation on two popular benchmarks, Defects4J and QuixBugs, shows that ENCORE fixed 42 bugs, including 16 that have not been fixed by existing techniques. In addition, ENCORE is the first G&V repair technique to be applied to four popular programming languages (Java, C++, Python, and JavaScript), fixing a total of 67 bugs across five benchmarks.
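The generate-and-validate loop with an ensemble of models can be sketched roughly as follows. This is a minimal illustration, not ENCORE's implementation: the stand-in "models" are plain functions returning ranked candidate patches (the real models are convolutional NMT networks), and the names `ensemble_candidates`, `generate_and_validate`, and `passes_tests`, as well as the best-rank merge rule, are assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence

# Hypothetical stand-in: a "model" maps a buggy line to a ranked list of
# candidate fixes. In ENCORE these would be trained convolutional NMT models.
Model = Callable[[str], Sequence[str]]


def ensemble_candidates(models: Iterable[Model], buggy_line: str) -> List[str]:
    """Merge ranked outputs of several models into one deduplicated list.

    Assumed merge rule: order candidates by the best (lowest) rank any model
    assigned them; ties keep the order in which candidates were first seen.
    """
    best_rank: Dict[str, int] = {}
    for model in models:
        for rank, patch in enumerate(model(buggy_line)):
            if patch not in best_rank or rank < best_rank[patch]:
                best_rank[patch] = rank
    # sorted() is stable and dicts preserve insertion order, so ties are
    # broken by first appearance.
    return sorted(best_rank, key=lambda p: best_rank[p])


def generate_and_validate(
    models: Iterable[Model],
    buggy_line: str,
    passes_tests: Callable[[str], bool],
) -> Optional[str]:
    """Return the first merged candidate that passes validation, else None."""
    for patch in ensemble_candidates(models, buggy_line):
        if passes_tests(patch):
            return patch
    return None


# Toy models that propose different fixes for the same buggy line.
def model_a(line: str) -> Sequence[str]:
    return ["return a - b", "return a * b"]


def model_b(line: str) -> Sequence[str]:
    return ["return a * b", "return a + b"]


def passes_tests(patch: str) -> bool:
    """Toy validation step: plug the patch into add() and run one test."""
    namespace: dict = {}
    exec("def add(a, b):\n    " + patch, namespace)
    return namespace["add"](2, 3) == 5


if __name__ == "__main__":
    print(generate_and_validate([model_a, model_b], "return a - b", passes_tests))
```

In this toy setup `model_a` alone never proposes a passing patch, while the ensemble does, which mirrors the abstract's point that different models fix different bugs.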
