Bug-fix benchmarks are essential for evaluating methodologies in automatic program repair (APR) and fault localization (FL). However, existing benchmarks, exemplified by Defects4J, need to evolve to incorporate recent bug-fixes aligned with contemporary development practices. Moreover, reproducibility, a key scientific principle, has been lacking in bug-fix benchmarks. To address these gaps, we present GitBug-Java, a reproducible benchmark of recent Java bugs. GitBug-Java features 199 bugs extracted from the 2023 commit history of 55 notable open-source repositories. The methodology for building GitBug-Java ensures that bug-fixes are preserved in fully reproducible environments. We publish GitBug-Java at https://github.com/gitbugactions/gitbug-java.