Bug-fix benchmarks are essential for evaluating methodologies in automatic program repair (APR) and fault localization (FL). However, existing benchmarks, exemplified by Defects4J, need to evolve to incorporate recent bug fixes aligned with contemporary development practices. Moreover, reproducibility, a key scientific principle, has been lacking in bug-fix benchmarks. To address these gaps, we present GitBug-Java, a reproducible benchmark of recent Java bugs. GitBug-Java features 199 bugs extracted from the 2023 commit history of 55 notable open-source repositories. The methodology for building GitBug-Java ensures the preservation of bug fixes in fully reproducible environments. We publish GitBug-Java at https://github.com/gitbugactions/gitbug-java.