Background: Code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although qualitative studies have reported this capability, our understanding of code review as a communication network is still limited. Objective: In this article, we report on a first step towards evaluating the capability of code review as a communication network by quantifying how fast and how far information can spread through code review: the upper bound of information diffusion in code review. Method: In an in-silico experiment, we simulate an artificial information diffusion within large (Microsoft), mid-sized (Spotify), and small (Trivago) code review systems modelled as communication networks. We then measure the minimal topological and temporal distances between the participants to quantify how far and how fast information can spread in code review. Results: An average code review participant in the small and mid-sized code review systems can spread information to between 72% and 85% of all code review participants within four weeks, independently of network size and tooling; for the large code review systems, we found an absolute boundary of about 11,000 reachable participants. On average (median), information can spread between two participants in code review in fewer than five hops and less than five days. Conclusion: We found evidence that the communication network emerging from code review scales well and spreads information fast and broadly, corroborating the findings of prior qualitative work. The study lays the foundation for understanding and improving code review as a communication network.
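To make the two distance measures in the Method concrete, the following is a minimal sketch of how a temporal distance (earliest arrival time) and a topological distance (fewest hops along a time-respecting path) could be computed. It is not the study's actual model or code. It assumes a simplified, hypothetical data model in which each code review is an instantaneous contact event `(day, participants)`, and information held by any participant at that time reaches all other participants of the same review. The function names `earliest_arrival` and `min_hops` and the example data are illustrative assumptions.

```python
from math import inf


def earliest_arrival(events, source, start=0.0):
    """Temporal distance: earliest day each participant can be informed.

    `events` is an iterable of (day, participants) pairs (an assumed,
    simplified stand-in for code reviews). Events with equal timestamps
    are processed in sorted order, which is a simplification.
    """
    arrival = {source: start}
    for t, people in sorted(events, key=lambda e: e[0]):
        if t < start:
            continue
        # The review spreads information if someone in it is already informed.
        if any(arrival.get(p, inf) <= t for p in people):
            for p in people:
                arrival[p] = min(arrival.get(p, inf), t)
    return arrival


def min_hops(events, source, start=0.0, horizon=inf):
    """Topological distance: fewest reviews (hops) on a time-respecting path."""
    ordered = sorted(
        (e for e in events if start <= e[0] <= start + horizon),
        key=lambda e: e[0],
    )
    reach = {source: start}  # earliest arrival using at most k - 1 hops
    hops = {source: 0}
    k = 1
    while True:
        nxt = dict(reach)
        for t, people in ordered:
            # Only arrivals from the previous round may trigger a new hop.
            if any(reach.get(p, inf) <= t for p in people):
                for p in people:
                    if t < nxt.get(p, inf):
                        nxt[p] = t
        for p in nxt:
            hops.setdefault(p, k)  # first round in which p became reachable
        if nxt == reach:
            return hops
        reach, k = nxt, k + 1


# Hypothetical example: four reviews among participants a, b, c, d.
reviews = [
    (1, {"a", "b"}),
    (2, {"b", "c"}),
    (3, {"c", "d"}),
    (4, {"a", "d"}),
]
arrival = earliest_arrival(reviews, "a")
hops = min_hops(reviews, "a")
```

In this example, `d` is reachable fastest on day 3 through three reviews, but with the fewest hops (one) through the review on day 4, which illustrates why the shortest and the fastest paths between two participants can differ. Passing `horizon=28` would restrict the search to a four-week window.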