The rapid advancement of customized Large Language Models (LLMs) offers considerable convenience. However, it also intensifies concerns about protecting copyrighted and confidential information. With the extensive adoption of private LLMs, safeguarding model copyright and ensuring data privacy have become critical. Text watermarking has emerged as a viable solution for detecting AI-generated content and protecting models. However, existing methods fall short in providing an individualized watermark for each user, a critical feature for enhancing accountability and traceability. In this paper, we introduce PersonaMark, a novel personalized text watermarking scheme designed to protect LLM copyright and bolster accountability. PersonaMark leverages sentence structure as a subtle carrier of watermark information and optimizes the generation process to preserve the model's natural output. By employing a personalized hashing function, it embeds a unique watermark for each user, enabling high-quality text generation without compromising the model's performance. The approach is both time-efficient and scalable, capable of handling large numbers of users through a multi-user hashing mechanism. To the best of our knowledge, this is a pioneering study of personalized watermarking in LLMs. We conduct extensive evaluations across four LLMs, analyzing metrics such as perplexity, sentiment, alignment, and readability. The results validate that PersonaMark preserves text quality, ensures unbiased watermark insertion, and offers robust watermark detection, all while maintaining the model's behavior with minimal disruption.
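The per-user keying idea described above can be sketched in a few lines. This is a minimal illustration under assumed details, not the paper's actual scheme: the names `user_watermark_key` and `watermark_bit`, the use of SHA-256, and the representation of "sentence structure" as a tuple of syntactic labels are all hypothetical choices for exposition.

```python
import hashlib

def user_watermark_key(user_id: str, secret: str = "owner-secret") -> int:
    """Derive a distinct per-user watermark key (hypothetical hashing function,
    standing in for the paper's personalized hashing mechanism)."""
    digest = hashlib.sha256(f"{secret}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % (2 ** 32)

def watermark_bit(sentence_structure: tuple, key: int) -> int:
    """Map a sentence-structure signature (here, a toy tuple of syntactic tags)
    and a user key to a pseudo-random bit. A generator could then steer toward
    structures whose bit matches the intended watermark payload; a detector
    recomputes the same bits from the observed text and the candidate key."""
    h = hashlib.sha256((repr(sentence_structure) + str(key)).encode()).digest()
    return h[0] & 1
```

Because the key is derived per user, two users receive different bit patterns for the same structures, which is what enables tracing a leaked text back to an individual user rather than only to the model.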