Language models (LMs) derive their capabilities from extensive training on diverse data, including potentially copyrighted material. These models can memorize and generate content similar to their training data, raising potential concerns. Model creators are therefore motivated to develop mitigation methods that prevent the generation of protected content. We term this procedure copyright takedowns for LMs, noting its conceptual similarity to (but legal distinction from) the DMCA takedown. This paper introduces the first evaluation of the feasibility and side effects of copyright takedowns for LMs. We propose CoTaEval, an evaluation framework that assesses the effectiveness of copyright takedown methods, their impact on the model's ability to retain uncopyrightable factual knowledge from training data whose recitation is embargoed, and how well the model maintains its general utility and efficiency. We examine several strategies, including adding system prompts, decoding-time filtering interventions, and unlearning approaches. Our findings indicate that no tested method excels across all metrics, revealing significant room for research in this unique problem setting and pointing to potentially unresolved challenges for live policy proposals.