What Makes a Code Review Useful to OpenDev Developers? An Empirical Investigation

Context: Because code reviews (CRs) consume significant effort, even a minor improvement in their effectiveness can yield significant savings for a software development organization. Aim: This study aims to develop a finer-grained understanding of what makes a code review comment useful to OSS developers, to what extent a code review comment is considered useful to them, and how various contextual and participant-related factors influence its usefulness level. Method: Toward this goal, we conducted a three-stage mixed-method study. We randomly selected 2,500 CR comments from the OpenDev Nova project and manually categorized them. We designed a survey of OpenDev developers to better understand their perspectives on useful CRs. Combining our survey-obtained scores with our manually labeled dataset, we trained two regression models: one to identify factors that influence the usefulness of CR comments, and the other to identify factors that improve the odds of identifying "Functional" defects over the others. Key findings: The results of our study suggest that a CR comment's usefulness is dictated not only by its technical contributions, such as defect findings or quality improvement tips, but also by its linguistic characteristics, such as comprehensibility and politeness. While a reviewer's coding experience is positively associated with CR usefulness, the number of mutual reviews, the comment volume in a file, the total number of lines added/modified, and the CR interval have the opposite association. Although authorship and reviewership experience for the files under review have been the most popular attributes for reviewer recommendation systems, we do not find any significant association between those attributes and CR usefulness.
