Is Your Writing Being Mimicked by AI? Unveiling Imitation with Invisible Watermarks in Creative Writing

Efficient knowledge injection methods for Large Language Models (LLMs), such as In-Context Learning, knowledge editing, and parameter-efficient fine-tuning, significantly enhance model utility on downstream tasks. However, they also pose substantial risks of unauthorized imitation and compromised data provenance for high-value unstructured data assets such as creative works. Current copyright protection methods for creative works focus predominantly on the visual arts, leaving the safeguarding of creative writing as a critical, unaddressed data engineering challenge. In this paper, we propose WIND (Watermarking via Implicit and Non-disruptive Disentanglement), a novel, implicit zero-watermarking scheme that safeguards creative writing databases by providing verifiable copyright protection. Specifically, we decompose creative essence into five key elements, which are extracted by LLMs through a purpose-built instance delimitation mechanism and consolidated into condensed lists. These lists enable WIND to convert core copyright attributes into verifiable watermarks via implicit encoding within a disentangled creative space, where "disentanglement" refers to the separation of creative-specific and creative-irrelevant features. Because the encoding is implicit, the approach avoids distorting fragile textual content. Extensive experiments demonstrate that WIND effectively verifies copyright ownership of creative writing against AI imitation, achieving F1 scores above 98% and maintaining robust performance under stringent low false-positive rates, where existing state-of-the-art text watermarking methods struggle.
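To make the term "zero-watermarking" concrete, the following is a minimal sketch of the generic idea: the protected text is never modified, and a verification key is instead derived from features of the work and registered separately. This is not the WIND method itself. The abstract does not name the five elements or describe the encoding, so the example elements, the SimHash-style fingerprint, the 64-bit length, and the names `fingerprint`, `register`, and `similarity` are all assumptions introduced purely for illustration.

```python
import hashlib

BITS = 64  # fingerprint length; an arbitrary choice for this sketch


def _hash_bits(text: str) -> int:
    """Map a string to a deterministic BITS-bit integer via SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[: BITS // 8], "big")


def fingerprint(elements: list[str]) -> int:
    """SimHash-style fingerprint: per-bit majority vote over element hashes.

    A stand-in for a learned encoder; similar element lists tend to share bits.
    """
    votes = [0] * BITS
    for element in elements:
        h = _hash_bits(element.lower().strip())
        for i in range(BITS):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if votes[i] > 0)


def register(elements: list[str], owner_id: str) -> int:
    """Build the zero-watermark key: fingerprint XOR owner bits.

    The key is stored externally; the text itself is left untouched.
    """
    return fingerprint(elements) ^ _hash_bits(owner_id)


def similarity(suspect_elements: list[str], key: int, owner_id: str) -> float:
    """Recover the owner bits from a suspect work and score the match in [0, 1]."""
    recovered = fingerprint(suspect_elements) ^ key
    differing = bin(recovered ^ _hash_bits(owner_id)).count("1")
    return 1.0 - differing / BITS


# Hypothetical condensed list of creative elements (placeholders, not from the paper).
original = [
    "a lighthouse keeper who collects storms in jars",
    "a drowned village that rings its bells at low tide",
    "second-person narration addressed to the sea",
    "a debt that can only be repaid in weather",
    "an ending that repeats the opening line",
]
unrelated = [
    "a detective on a night train",
    "a stolen violin",
    "clipped first-person narration",
    "a rivalry between two conductors",
    "an ending in a snowed-in station",
]

key = register(original, "author-42")
print(similarity(original, key, "author-42"))   # 1.0 for the registered work
print(similarity(unrelated, key, "author-42"))  # expected near 0.5 for unrelated work
```

In a real system the decision threshold on the similarity score would be calibrated against a target false-positive rate, which is the regime the abstract emphasizes.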

