Factoring Expertise, Workload, and Turnover into Code Review Recommendation

Developer turnover is inevitable on software projects and leads to knowledge loss, a reduction in productivity, and an increase in defects. Mitigation strategies to deal with turnover tend to disrupt developers and increase their workloads. In this work, we suggest that code review recommendation can distribute knowledge and mitigate the effects of turnover while distributing review workload more evenly. We conduct historical analyses to understand the natural concentration of review workload and the degree of knowledge spreading that is inherent in code review. Even though review workload is highly concentrated, we show that code review naturally spreads knowledge, thereby reducing the files at risk to turnover. Using simulation, we evaluate existing code review recommenders and develop novel recommenders to understand their impact on the level of expertise during review, the workload of reviewers, and the files at risk to turnover. Our simulations use seeded random replacement of reviewers, which allows us to compare the reviewer recommenders without the confounding variation of different reviewers being replaced for each recommender. Combining recommenders, we develop the SofiaWL recommender, which suggests experts with low active review workload when none of the files under review are known by only one developer. In contrast, when knowledge is concentrated on one developer, it sends the review to other reviewers to spread knowledge. For the projects we study, we are able to globally increase expertise during reviews (+3%), reduce workload concentration (-12%), and reduce the files at risk (-28%). We make our scripts and data available in our replication package. Developers can optimize for a particular outcome measure based on the needs of their project, or use our GitHub bot to automatically balance the outcomes.
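The two-branch rule described for SofiaWL can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` fields, the `expertise / (1 + open_reviews)` ranking, and the "least loaded other reviewer" choice in the knowledge-spreading branch are all assumptions made for illustration, since the abstract does not specify the scoring functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    expertise: float   # hypothetical score, e.g. share of the files under review the candidate knows
    open_reviews: int  # hypothetical count of the candidate's currently active reviews


def files_at_risk(files, knowers):
    """Return the files under review that are known by exactly one developer."""
    return [f for f in files if len(knowers.get(f, set())) == 1]


def recommend(files, knowers, candidates):
    """Pick one reviewer using the two-branch rule (assumed scoring, see lead-in)."""
    at_risk = files_at_risk(files, knowers)
    if not at_risk:
        # No concentrated knowledge: favor experts, discounted by active workload.
        return max(candidates, key=lambda c: c.expertise / (1 + c.open_reviews))
    # Knowledge is concentrated: route the review away from the sole knowers
    # so that another developer learns the at-risk files.
    sole_knowers = {next(iter(knowers[f])) for f in at_risk}
    others = [c for c in candidates if c.name not in sole_knowers]
    pool = others or candidates  # fall back if every candidate is a sole knower
    return min(pool, key=lambda c: c.open_reviews)


# Hypothetical project state, for illustration only.
knowers = {"a.py": {"alice", "bob"}, "b.py": {"alice"}}
candidates = [
    Candidate("alice", expertise=1.0, open_reviews=5),
    Candidate("bob", expertise=0.5, open_reviews=1),
    Candidate("carol", expertise=0.0, open_reviews=0),
]

expert_pick = recommend(["a.py"], knowers, candidates)
spread_pick = recommend(["a.py", "b.py"], knowers, candidates)
```

In this toy state, a review touching only `a.py` goes to the lightly loaded expert, while a review that includes `b.py` (known only to `alice`) is routed to someone else. A real recommender would rank several candidates and weigh learning against expertise rather than pick a single name.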
