An Unforgeable Publicly Verifiable Watermark for Large Language Models

Text watermarking algorithms for large language models (LLMs) have recently been proposed to mitigate the potential harms of LLM-generated text, including fake news and copyright issues. However, current watermark detection algorithms require the secret key used in the watermark generation process, which makes them susceptible to security breaches and counterfeiting during public detection. To address this limitation, we propose an unforgeable, publicly verifiable watermark algorithm that uses two different neural networks for watermark generation and detection, instead of using the same key at both stages. The token embedding parameters are shared between the generation and detection networks, which allows the detection network to reach high accuracy efficiently. Experiments demonstrate that our algorithm attains high detection accuracy and computational efficiency with neural networks that have a minimal number of parameters. Subsequent analysis confirms the high complexity involved in forging the watermark from the detection network. Our code and data are available at https://github.com/THU-BPM/unforgeable_watermark.
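To make the parameter-sharing idea concrete, the following is a minimal, untrained sketch of two separate networks that reference one token embedding table. It is not the authors' implementation: the abstract does not specify the architectures, so the window-based labelling in `GenerationNetwork`, the mean pooling in `DetectionNetwork`, and all names and sizes (`VOCAB_SIZE`, `EMBED_DIM`, `WINDOW`) are hypothetical choices made only for illustration.

```python
import math
import random

# All sizes below are hypothetical toy values, not taken from the paper.
VOCAB_SIZE = 16
EMBED_DIM = 4
WINDOW = 3


class SharedEmbedding:
    """Token embedding table that both networks hold a reference to."""

    def __init__(self, rng):
        self.table = [
            [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
            for _ in range(VOCAB_SIZE)
        ]

    def __call__(self, token_id):
        return self.table[token_id]


class GenerationNetwork:
    """Assumed private network: labels a token window as 1 or 0."""

    def __init__(self, embedding, rng):
        self.embedding = embedding
        # Weights that belong only to the generation side.
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(WINDOW * EMBED_DIM)]

    def label(self, window):
        features = [x for token in window for x in self.embedding(token)]
        score = sum(w * x for w, x in zip(self.weights, features))
        return 1 if score > 0 else 0


class DetectionNetwork:
    """Assumed public network: scores a whole token sequence in (0, 1)."""

    def __init__(self, embedding, rng):
        self.embedding = embedding
        # Separate weights: the detector never sees the generator's weights.
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]

    def score(self, tokens):
        # Mean-pool the shared embeddings, then apply a logistic output.
        pooled = [
            sum(self.embedding(t)[d] for t in tokens) / len(tokens)
            for d in range(EMBED_DIM)
        ]
        logit = sum(w * x for w, x in zip(self.weights, pooled))
        return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
shared = SharedEmbedding(rng)
generator = GenerationNetwork(shared, rng)
detector = DetectionNetwork(shared, rng)

tokens = [1, 5, 7, 2, 9]
labels = [
    generator.label(tokens[i:i + WINDOW])
    for i in range(len(tokens) - WINDOW + 1)
]
probability = detector.score(tokens)
```

Here `generator` and `detector` have disjoint weights but read the same `shared` table, which mirrors the claim that only the token embedding parameters are common to both stages. Because nothing is trained, `labels` and `probability` carry no meaning beyond showing how the pieces connect.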
