This paper explores the complexities of automatically detecting software similarity, in light of the unique challenges posed by digital artifacts, and introduces Project Martial, an open-source solution for detecting code similarity. We enumerate some of the existing approaches to countering software plagiarism by examining both the academic and legal landscapes, including notable lawsuits and court rulings that have shaped the understanding of software copyright infringement in commercial applications. Furthermore, we categorize detection challenges according to the available artifacts and survey the techniques previously studied in the literature, including solutions based on fingerprinting, software birthmarks, and code embeddings, and we illustrate how a subset of them can be applied in the context of Project Martial.