Open source software (OSS) licenses regulate the conditions under which OSS can be legally reused, distributed, and modified. However, incorporating third-party OSS that carries its own licenses often leads to license incompatibility, which occurs when multiple licenses coexist in one project and their terms conflict. Fixing license incompatibility issues requires substantial effort because licenses are poorly understood and package dependencies are complex. In this paper, we propose LiResolver, a fine-grained, scalable, and flexible tool to resolve license incompatibility issues for open source software. Specifically, it first understands the semantics of licenses through fine-grained entity extraction and relation extraction. Then, it detects and resolves license incompatibility issues, giving priority to recommending official licenses. When no official license can satisfy the constraints, it generates a custom license as an alternative solution. Comprehensive experiments demonstrate the effectiveness of LiResolver, with a 4.09% false positive (FP) rate and a 0.02% false negative (FN) rate for incompatibility issue localization, and with 62.61% of 230 real-world incompatible projects resolved. We discuss the feedback from OSS developers and the lessons learned from this work. All the datasets and the replication package of LiResolver have been made publicly available to facilitate follow-up research.
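The detect-then-resolve flow described above can be illustrated with a minimal sketch. It is not LiResolver's implementation: the `License` model (a set of withheld rights plus a set of imposed obligations), the compatibility rule (a project license must keep every restriction and obligation of each component license), and the names `locate_issues` and `resolve` are all assumptions made for illustration. The real tool derives license semantics through entity and relation extraction, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    """Toy license model (hypothetical, not LiResolver's representation)."""
    name: str
    forbidden: frozenset = frozenset()  # rights withheld, e.g. "sublicense"
    required: frozenset = frozenset()   # obligations imposed, e.g. "disclose_source"


def conflicts(project, component):
    """Return the component's terms that the project license fails to keep."""
    dropped_restrictions = component.forbidden - project.forbidden
    dropped_obligations = component.required - project.required
    return dropped_restrictions | dropped_obligations


def locate_issues(project, components):
    """Map each incompatible component name to the terms the project drops."""
    issues = {}
    for component in components:
        missing = conflicts(project, component)
        if missing:
            issues[component.name] = missing
    return issues


def resolve(components, catalog):
    """Prefer the first compatible catalog license; otherwise build a custom one.

    `catalog` stands in for an ordered list of official licenses (assumed).
    The custom fallback is simply the union of all component terms.
    """
    for candidate in catalog:
        if not locate_issues(candidate, components):
            return candidate
    forbidden = frozenset().union(*(c.forbidden for c in components))
    required = frozenset().union(*(c.required for c in components))
    return License("custom", forbidden, required)
```

In this sketch the catalog order encodes the preference among official licenses, and the custom license is produced only after every catalog entry has been rejected, mirroring the priority described in the abstract. The license names used with it should be placeholders rather than real licenses, since the term sets are invented.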