Copyright Traps for Large Language Models

Questions of fair use of copyright-protected content to train Large Language Models (LLMs) are being actively debated. Document-level inference has been proposed as a new task: inferring, from black-box access to the trained model, whether a piece of content was seen during training. State-of-the-art methods, however, rely on naturally occurring memorization of (part of) the content. While these methods are very effective against models that memorize a lot, we hypothesize, and later confirm, that they do not work against models that do not naturally memorize, e.g. medium-size 1B models. We here propose to use copyright traps, the inclusion of fictitious entries in original content, to detect the use of copyrighted materials in LLMs, with a focus on models where memorization does not naturally occur. We carefully design an experimental setup, randomly inserting traps into original content (books) and training a 1.3B LLM. We first validate that the use of content in our target model would be undetectable using existing methods. We then show, contrary to intuition, that even medium-length trap sentences repeated a significant number of times (100) are not detectable using existing methods. However, we show that longer sequences repeated a large number of times can be reliably detected (AUC=0.75) and used as copyright traps. We further improve these results by studying how the number of times a sequence is seen improves detectability, how sequences with higher perplexity tend to be memorized more, and how taking context into account further improves detectability.
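The two mechanical steps described in the abstract, inserting a trap sequence at random positions in a book and measuring detectability as an AUC, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `inject_trap` and `auc`, the word-level insertion, and the idea of passing in precomputed membership scores (for example, negated perplexity of the trap under the target model) are all assumptions made for illustration.

```python
import random


def inject_trap(text, trap, n_repeats, seed=0):
    """Insert `trap` at `n_repeats` random word boundaries of `text`.

    Hypothetical helper: the abstract only says traps are inserted randomly;
    word-level insertion is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    words = text.split()
    # Draw insertion points in the original word index space, then insert
    # from the highest index down so earlier indices stay valid.
    positions = sorted(
        (rng.randint(0, len(words)) for _ in range(n_repeats)), reverse=True
    )
    for pos in positions:
        words.insert(pos, trap)
    return " ".join(words)


def auc(member_scores, non_member_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen member (trap seen in
    training) gets a higher score than a randomly chosen non-member,
    counting ties as one half. Scores are assumed to be "higher means
    more likely seen", e.g. negated perplexity (an assumption).
    """
    wins = 0.0
    for m in member_scores:
        for n in non_member_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(non_member_scores))


if __name__ == "__main__":
    book = "the quick brown fox jumps over the lazy dog"
    trapped = inject_trap(book, "zorblat quix venmar", n_repeats=5)
    print(trapped)
    # Toy scores only, not results from the paper.
    print(auc([0.9, 0.4, 0.7], [0.3, 0.5, 0.2]))
```

An AUC of 0.5 corresponds to chance-level detection, so the abstract's AUC=0.75 means that a trap sequence seen during training receives a higher score than an unseen one in about three out of four pairs.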
