To create unit tests, it may be necessary to refactor the production code, e.g., by widening access to specific methods or by decomposing classes into smaller units that are easier to test independently. We report on an extensive study of such composite refactoring procedures performed to improve testability. We collected and studied 346,841 Java pull requests from 621 GitHub projects. First, we compared the atomic refactorings in two populations: pull requests with changed test-pairs (i.e., with co-changes in production and test code, and thus potentially including testability refactoring) and pull requests without test-pairs. We found significantly more atomic refactorings in test-pairs pull requests, such as Change Variable Type Operation or Change Parameter Type. Second, we manually analyzed the code changes of 200 pull requests in which developers explicitly mention the terms "testability" or "refactor + test". We identified ten composite refactoring procedures for the purpose of testability, which we call testability refactoring patterns. Third, we manually analyzed an additional 524 test-pairs pull requests: both randomly selected ones and ones where we expected to find testability refactorings, e.g., pull requests about dependency or concurrency issues. About 25% of all analyzed pull requests actually included testability refactoring patterns. The most frequent patterns were extract a method for override or for invocation, widen access to a method for invocation, and extract a class for invocation. We also report on frequent atomic refactorings that co-occur with the patterns and discuss the implications of our findings for research, practice, and education.
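As an illustration of two of the most frequent patterns named above, the following minimal Java sketch shows what "extract a method for override" and "widen access to a method for invocation" might look like in practice. The abstract does not define the patterns in detail, so this reading of them is an assumption, and every name here (`InvoiceService`, `today`, `lateFeeCents`, `FixedDateInvoiceService`, `TestabilityDemo`) is hypothetical and not taken from the studied projects.

```java
import java.time.LocalDate;

// Hypothetical production class, shown after the refactoring.
class InvoiceService {
    // "Extract a method for override" (assumed reading): the call to
    // LocalDate.now() was pulled out of isOverdue() into its own
    // overridable method, so a test can replace the system clock.
    protected LocalDate today() {
        return LocalDate.now();
    }

    boolean isOverdue(LocalDate dueDate) {
        return today().isAfter(dueDate);
    }

    // "Widen access to a method for invocation" (assumed reading):
    // imagined to have been private before; package-private access
    // lets a test in the same package call it directly.
    int lateFeeCents(long daysLate) {
        if (daysLate <= 0) {
            return 0;
        }
        // 50 cents per day late, capped at 2000 cents.
        return (int) Math.min(daysLate * 50, 2_000);
    }
}

// Test-side subclass that overrides the extracted method with a fixed date.
class FixedDateInvoiceService extends InvoiceService {
    private final LocalDate fixed;

    FixedDateInvoiceService(LocalDate fixed) {
        this.fixed = fixed;
    }

    @Override
    protected LocalDate today() {
        return fixed;
    }
}

class TestabilityDemo {
    public static void main(String[] args) {
        InvoiceService service =
                new FixedDateInvoiceService(LocalDate.of(2024, 3, 10));
        System.out.println(service.isOverdue(LocalDate.of(2024, 3, 1)));  // true
        System.out.println(service.isOverdue(LocalDate.of(2024, 3, 10))); // false
        System.out.println(service.lateFeeCents(3));   // 150
        System.out.println(service.lateFeeCents(100)); // 2000
    }
}
```

Before such a refactoring, `isOverdue` would depend directly on the system clock and the fee calculation would be reachable only through other methods, which makes both hard to check in isolation. The trade-off is that the production class exposes slightly more than it strictly needs to.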