The NPM package repository contains over two million packages and serves tens of billions of downloads per week. Nearly every JavaScript application uses the NPM package manager to install packages from the NPM repository. NPM relies on a "semantic versioning" ("semver") scheme to maintain a healthy ecosystem, in which bug fixes are reliably delivered to downstream packages as quickly as possible, while breaking changes require manual intervention by downstream package maintainers. To understand how developers use semver, we build a dataset containing every version of every package on NPM and analyze the flow of updates throughout the ecosystem. We build a time-travelling dependency resolver for NPM, which allows us to determine precisely which versions of each dependency would have been resolved at different times. We segment our analysis so that security-relevant updates (those that introduce or patch vulnerabilities) can be compared directly with the rest of the ecosystem. We find that when developers use semver correctly, critical updates such as security patches flow rapidly to downstream packages in the majority of cases (90.09%), but this does not always occur, because developers use both semver version constraints and semver version number increments imperfectly. Our findings have implications for developers and researchers alike. We make our infrastructure and dataset publicly available under an open source license.
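To make the idea of time-travelling resolution concrete, the following is a minimal Python sketch, not the authors' resolver. It assumes plain `MAJOR.MINOR.PATCH` versions and supports only exact, caret (`^`), and tilde (`~`) constraints, whereas a real NPM resolver also handles compound ranges, pre-release tags, and transitive dependencies. The names `parse_version`, `satisfies`, `resolve_at`, and the `RELEASES` history are hypothetical and chosen for illustration only.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies(version: Version, constraint: str) -> bool:
    """Check a version against an exact, caret (^), or tilde (~) constraint."""
    if constraint[0] not in "^~":
        # No operator: treat the constraint as an exact pin.
        return version == parse_version(constraint)
    base = parse_version(constraint[1:])
    major, minor, patch = base
    if constraint[0] == "~":
        # Tilde allows patch-level changes only.
        upper = (major, minor + 1, 0)
    elif major > 0:
        # Caret allows changes that keep the leftmost non-zero component.
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return base <= version < upper


def resolve_at(
    releases: Dict[str, datetime], constraint: str, when: datetime
) -> Optional[str]:
    """Return the highest satisfying version published at or before `when`."""
    candidates = [
        version
        for version, published in releases.items()
        if published <= when and satisfies(parse_version(version), constraint)
    ]
    return max(candidates, key=parse_version, default=None)


# Hypothetical release history; 1.2.4 stands in for a security patch.
RELEASES = {
    "1.2.3": datetime(2020, 1, 1),
    "1.2.4": datetime(2020, 3, 1),
    "2.0.0": datetime(2020, 6, 1),
}

# Before the patch is published, the caret constraint resolves to 1.2.3.
print(resolve_at(RELEASES, "^1.2.3", datetime(2020, 2, 1)))
# After publication, the same constraint picks up the patch automatically.
print(resolve_at(RELEASES, "^1.2.3", datetime(2020, 7, 1)))
# An exact pin never receives the patch, whatever the date.
print(resolve_at(RELEASES, "1.2.3", datetime(2020, 7, 1)))
```

In this toy history, the caret constraint lets the patch release reach the downstream package as soon as it is published, while the exact pin blocks it. This is the kind of constraint-dependent difference in update flow that the analysis measures at ecosystem scale.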