The reuse and distribution of open-source software must comply with its accompanying open-source license. In modern packaging ecosystems, maintaining such compliance is challenging because a package may have a complex, multi-layered dependency graph containing many packages, any of which may carry an incompatible license. Although prior research finds that license incompatibilities are prevalent, empirical evidence is still scarce for some modern packaging ecosystems (e.g., PyPI). It also remains unclear how developers remediate license incompatibilities in the dependency graphs of their packages (including direct and transitive dependencies), let alone how such remediation could be automated. To bridge this gap, we conduct a large-scale empirical study of license incompatibilities and their remediation practices in the PyPI ecosystem. We find that 7.27% of PyPI package releases have license incompatibilities and that 61.3% of them are caused by transitive dependencies, which makes remediation challenging; to remediate, developers can apply one of five strategies: migration, removal, pinning versions, changing their own licenses, and negotiation. Inspired by our findings, we propose SILENCE, an SMT-solver-based approach that recommends license incompatibility remediations with minimal costs in a package's dependency graph. Our evaluation shows that the remediations proposed by SILENCE can match 19 historical real-world cases (except for migrations not covered by an existing knowledge base) and have been accepted by five popular PyPI packages whose developers were previously unaware of their license incompatibilities.
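To make the distinction between direct and transitive license incompatibilities concrete, the following is a minimal sketch of how one might scan a dependency graph for them. It is not the SILENCE approach, which is SMT-solver-based and also searches for minimal-cost remediations. The package names, the `PACKAGES` and `INCOMPATIBLE` tables, and the function `find_incompatibilities` are all hypothetical, and the compatibility pairs are illustrative assumptions rather than legal guidance.

```python
from collections import deque

# Hypothetical toy data: package name -> (license, direct dependencies).
PACKAGES = {
    "app": ("MIT", ["web", "utils"]),
    "web": ("Apache-2.0", ["parser"]),
    "utils": ("BSD-3-Clause", []),
    "parser": ("GPL-3.0-only", []),
}

# Illustrative assumption: (dependency license, root license) pairs that
# this sketch treats as incompatible. A real study would need a vetted
# compatibility matrix.
INCOMPATIBLE = {
    ("GPL-3.0-only", "MIT"),
    ("GPL-3.0-only", "Apache-2.0"),
    ("GPL-3.0-only", "BSD-3-Clause"),
}


def find_incompatibilities(root, packages, incompatible):
    """Return (package, license, kind) for each dependency of `root`
    whose license is flagged as incompatible with the root's license.

    `kind` is "direct" for depth-1 dependencies and "transitive" otherwise.
    Breadth-first traversal guarantees each package is reported at its
    shallowest depth.
    """
    root_license = packages[root][0]
    findings = []
    seen = {root}
    queue = deque((dep, 1) for dep in packages[root][1])
    while queue:
        name, depth = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        dep_license, deps = packages[name]
        if (dep_license, root_license) in incompatible:
            kind = "direct" if depth == 1 else "transitive"
            findings.append((name, dep_license, kind))
        queue.extend((dep, depth + 1) for dep in deps)
    return findings


if __name__ == "__main__":
    for name, lic, kind in find_incompatibilities("app", PACKAGES, INCOMPATIBLE):
        print(f"{name} ({lic}) is a {kind} incompatibility")
```

In this toy graph, `app` never declares `parser` itself; the flagged license arrives only through `web`. This is the kind of transitive case that the abstract reports as the majority of incompatibilities and as the harder one to remediate.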