Existing innovation metrics inadequately capture software innovation, creating blind spots for researchers and policymakers seeking to understand and foster technological innovation in an increasingly software-defined economy. This paper introduces a novel measure of software innovation based on open source software (OSS) development activity on GitHub. We examine the dependency growth and release complexity of 350,000 unique releases from 33,000 unique packages across the JavaScript, Python, and Ruby ecosystems over two years post-release. We find that the semantic versioning types of OSS releases exhibit ecosystem-specific and maturity-dependent patterns in predicting one-year dependency growth: minor releases show relatively consistent adoption across contexts, while major and patch releases vary significantly by ecosystem and package size. In addition, while semantic versioning correlates with the technical complexity of the change-set, complexity itself shows minimal correlation with downstream adoption, suggesting that versioning signals rather than technical change drive dependency growth. Overall, semantic versioning release information can be used as a unit of innovation in OSS development, complementary to common sources for innovation metrics (e.g., scientific publications, patents, and standards), but this measure should be weighted by ecosystem culture, package maturity, and release type to accurately capture innovation dynamics. We conclude with a discussion of the theoretical and practical implications of this novel measure of software innovation as well as future research directions.