Context: The ability of Free and Open Source Software (FOSS) communities to stay viable and productive over time is pivotal for society, as they maintain the building blocks that digital infrastructure, products, and services depend on. Sustainability may, however, be characterized from multiple aspects, and little is known about how these aspects interplay and impact community outputs, software quality in particular. Objective: This study therefore aims to empirically explore how the different aspects of FOSS sustainability impact software quality. Method: Sixteen sustainability metrics across four categories were sampled and applied to a set of 217 OSS projects sourced from the Apache Software Foundation Incubator program. The impact of a decline in the sustainability metrics was analyzed against eight software quality metrics using Bayesian data analysis, which represents the regression coefficients and intercepts as probability distributions. Results: Findings suggest that the selected sustainability metrics do not significantly affect defect density or code coverage. However, a positive impact of community age was observed on specific code quality metrics, such as risk complexity, number of very large files, and code duplication percentage. Interestingly, findings show that even when communities remain sustainable, certain code quality metrics are negatively impacted. Conclusion: Findings imply that code quality practices are not consistently linked to sustainability, and that defect management and prevention may be prioritized over code quality practices. Results suggest that growth, resulting in a larger and more complex codebase, combined with a probable lack of understanding of code quality standards, may explain the degradation in certain aspects of code quality.
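To illustrate what it means for regression coefficients and intercepts to be represented as probability distributions, the following is a minimal sketch of a Bayesian linear regression with a closed-form posterior. It is not the model used in the study: the single predictor, the Gaussian priors, the known noise standard deviation, the function names (`posterior_linear`, `credible_interval`), and the toy data are all assumptions made for illustration.

```python
import math


def posterior_linear(xs, ys, noise_sd=1.0, prior_sd=10.0):
    """Posterior for y = intercept + slope * x + noise.

    Illustrative assumptions: independent N(0, prior_sd^2) priors on the
    intercept and slope, and Gaussian noise with known noise_sd.
    Returns a (mean, standard deviation) pair for each parameter.
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    s2, t2 = noise_sd ** 2, prior_sd ** 2

    # Posterior precision matrix: prior precision + X^T X / noise variance.
    p00 = 1.0 / t2 + n / s2
    p01 = sx / s2
    p11 = 1.0 / t2 + sxx / s2

    # Posterior covariance is the inverse of the 2x2 precision matrix.
    det = p00 * p11 - p01 * p01
    c00, c01, c11 = p11 / det, -p01 / det, p00 / det

    # Posterior mean: covariance times X^T y / noise variance.
    b0, b1 = sy / s2, sxy / s2
    return {
        "intercept": (c00 * b0 + c01 * b1, math.sqrt(c00)),
        "slope": (c01 * b0 + c11 * b1, math.sqrt(c11)),
    }


def credible_interval(mean, sd, z=1.96):
    """Approximate 95% credible interval for a Gaussian posterior."""
    return (mean - z * sd, mean + z * sd)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    xs = [0, 1, 2, 3, 4]
    ys = [1.0, 3.1, 4.9, 7.2, 9.0]
    post = posterior_linear(xs, ys)
    for name, (mean, sd) in post.items():
        low, high = credible_interval(mean, sd)
        print(f"{name}: mean={mean:.3f}, 95% interval=({low:.3f}, {high:.3f})")
```

In this framing, an effect is typically judged by whether the credible interval of its coefficient excludes zero, rather than by a p-value. Practical analyses of this kind usually rely on sampling-based tools and richer models than the conjugate case sketched here.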