Proving membership in LLM pretraining data via data watermarks

Detecting whether copyright holders' works were used in LLM pretraining is poised to be an important problem. This work proposes using data watermarks to enable principled detection with only black-box model access, provided that the rightholder contributed multiple training documents and watermarked them before public release. Because the data watermark is randomly sampled, detection can be framed as hypothesis testing, which provides guarantees on the false detection rate. We study two watermarks: one that inserts random sequences, and another that randomly substitutes characters with Unicode lookalikes. We first show how three aspects of watermark design (watermark length, number of duplications, and interference) affect the power of the hypothesis test. Next, we study how a watermark's detection strength changes under model and dataset scaling: while increasing the dataset size decreases the strength of the watermark, watermarks remain strong if the model size also increases. Finally, we view SHA hashes as natural watermarks and show that we can robustly detect hashes from BLOOM-176B's training data, as long as they occurred at least 90 times. Together, our results point towards a promising future for data watermarks in real-world use.
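
As a rough illustration of the hypothesis-testing framing, the sketch below samples a random-sequence watermark, appends it to a document, and compares the loss on the true watermark against a null distribution of losses on freshly sampled watermarks. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the alphabet, the z-score statistic, and in particular `make_toy_loss` (a stand-in for querying a real model's loss through black-box access) are all hypothetical.

```python
import random
import statistics
import string
from typing import Callable

# Assumed alphabet for random-sequence watermarks (illustrative choice).
ALPHABET = string.ascii_lowercase + string.digits


def sample_watermark(rng: random.Random, length: int = 20) -> str:
    """Draw a random character sequence to use as a watermark."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def watermark_document(doc: str, watermark: str) -> str:
    """Append the watermark to a document before public release."""
    return doc + "\n" + watermark


def detection_z_score(
    loss_fn: Callable[[str], float],
    watermark: str,
    rng: random.Random,
    n_null: int = 200,
) -> float:
    """Z-score of the true watermark's loss against random watermarks.

    Under the null hypothesis (the model never saw the watermark), the true
    watermark is just another random draw, so a strongly negative z-score
    is evidence that the watermark was in the training data.
    """
    null_losses = [
        loss_fn(sample_watermark(rng, len(watermark))) for _ in range(n_null)
    ]
    mu = statistics.mean(null_losses)
    sigma = statistics.stdev(null_losses)
    if sigma == 0:
        raise ValueError("null losses have zero variance; draw more samples")
    return (loss_fn(watermark) - mu) / sigma


def make_toy_loss(memorized: str) -> Callable[[str], float]:
    """Hypothetical stand-in for a model's loss on a sequence.

    Returns the fraction of positions that differ from a 'memorized' string,
    so the memorized string itself scores 0. A real test would instead
    compute the language model's loss on the sequence.
    """
    def loss(seq: str) -> float:
        mismatches = sum(a != b for a, b in zip(seq, memorized))
        return mismatches / len(memorized)
    return loss


if __name__ == "__main__":
    rng = random.Random(0)
    wm = sample_watermark(rng)
    released = watermark_document("Some article text.", wm)
    z = detection_z_score(make_toy_loss(wm), wm, random.Random(1))
    print(f"z-score for a model that memorized the watermark: {z:.1f}")
```

Because the watermark is sampled at random, the null distribution can be estimated simply by drawing more watermarks from the same sampler, which is what makes a bound on the false detection rate possible in this framing.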
