As machine learning tools progress, the inevitable question arises: how can machine learning help us write better code? With significant progress in natural language processing from models such as GPT-3 and BERT, applications of natural language processing techniques to code are starting to be explored. Most of the research has focused on automatic program repair (APR), and while the results on synthetic or highly filtered datasets are promising, such models are hard to apply in real-world scenarios because of inadequate bug localization. We propose BigIssue: a benchmark for realistic bug localization. The goal of the benchmark is twofold. We provide (1) a general benchmark with a diversity of real and synthetic Java bugs and (2) a motivation to improve the bug localization capabilities of models through attention to the full repository context. With the introduction of BigIssue, we hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.