Continuous integration at scale is costly but essential to software development. Test optimization techniques such as test selection and prioritization aim to reduce this cost. Test batching is an effective but overlooked alternative. This study evaluates the effect of parallelization on test batching by varying the number of machines, and introduces two novel batching approaches. We establish TestAll as a baseline to study the impact of parallelism and machine count on feedback time. We re-evaluate ConstantBatching and introduce DynamicBatching, which adapts the batch size to the number of changes remaining in the queue. We also propose TestCaseBatching, which enables new builds to join a batch before full test execution, thus speeding up continuous integration. Our evaluation uses results from Ericsson and 276 million test outcomes from the open-source Chrome project, and assesses both feedback time and execution reduction. We provide access to the scripts and data for the Chrome project. The results reveal a non-linear impact of test parallelization on feedback time, as each test delay compounds across the entire test queue. ConstantBatching, with a batch size of 4, uses up to 72% fewer machines to maintain the actual average feedback time and provides a constant execution reduction of up to 75%. Similarly, DynamicBatching maintains the actual average feedback time with up to 91% fewer machines and exhibits a variable execution reduction of up to 99%. TestCaseBatching maintains the actual average feedback time with up to 81% fewer machines and demonstrates a variable execution reduction of up to 67%. We recommend that practitioners use DynamicBatching and TestCaseBatching to efficiently reduce the number of testing machines required. Analyzing historical data to find the threshold beyond which adding more machines has minimal impact on feedback time is also crucial for resource-effective testing.
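To make the difference between the two batching policies concrete, the following is a minimal sketch, not the implementation used in the study. It assumes a best case in which every batch passes, so each batch costs a single test-suite run, and it treats execution reduction as the fraction of suite runs saved relative to testing every change individually. The function names (`constant_batches`, `dynamic_batch`, `best_case_execution_reduction`) are hypothetical and chosen only for illustration.

```python
from typing import List


def constant_batches(queue: List[str], batch_size: int) -> List[List[str]]:
    """Split the waiting changes into fixed-size batches (ConstantBatching-style)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [queue[i:i + batch_size] for i in range(0, len(queue), batch_size)]


def dynamic_batch(queue: List[str]) -> List[str]:
    """Take every change currently waiting as one batch (DynamicBatching-style).

    Assumption: the batch size simply equals the queue length at the moment
    a machine becomes free; the study's exact rule may differ.
    """
    batch = list(queue)
    queue.clear()
    return batch


def best_case_execution_reduction(num_changes: int, num_batches: int) -> float:
    """Fraction of suite runs saved if every batch passes on its first run."""
    if num_changes < 1:
        raise ValueError("num_changes must be at least 1")
    return 1.0 - num_batches / num_changes


if __name__ == "__main__":
    changes = [f"c{i}" for i in range(1, 9)]  # eight queued changes

    fixed = constant_batches(changes, batch_size=4)
    print(len(fixed), best_case_execution_reduction(len(changes), len(fixed)))

    waiting = list(changes)
    one_batch = dynamic_batch(waiting)
    print(len(one_batch), best_case_execution_reduction(len(changes), 1))
```

Under these assumptions, a fixed batch size of 4 saves three of every four suite runs, which is consistent with the 75% ceiling stated above. A queue-sized batch saves more as the queue grows, which is one way to read why the reduction for DynamicBatching is described as variable. Failing batches would require additional runs to locate the culprit change, so real reductions would be lower than this best case.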