The rapid advancement of customized Large Language Models (LLMs) offers considerable convenience. However, it also intensifies concerns about protecting copyrighted and confidential information. With the extensive adoption of private LLMs, safeguarding model copyright and ensuring data privacy have become critical. Text watermarking has emerged as a viable solution for detecting AI-generated content and protecting models. However, existing methods fall short in providing an individualized watermark for each user, a critical feature for enhancing accountability and traceability. In this paper, we introduce PersonaMark, a novel personalized text watermarking scheme designed to protect LLM copyright and bolster accountability. PersonaMark leverages sentence structure as a subtle carrier of watermark information and optimizes the generation process to preserve the model's natural output. By employing a personalized hashing function, it embeds a unique watermark for each user, enabling high-quality text generation without compromising the model's performance. The approach is both time-efficient and scalable, capable of handling large numbers of users through a multi-user hashing mechanism. To the best of our knowledge, this is a pioneering study of personalized watermarking in LLMs. We conduct extensive evaluations across four LLMs, analyzing metrics such as perplexity, sentiment, alignment, and readability. The results validate that PersonaMark preserves text quality, ensures unbiased watermark insertion, and offers robust watermark detection, all while maintaining the model's behavior with minimal disruption.
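The per-user keying idea described above can be sketched in a few lines. This is a minimal illustration under assumed details, not the paper's actual scheme: the names `user_watermark_key` and `watermark_bit`, the use of SHA-256, and the representation of "sentence structure" as a tuple of syntactic labels are all hypothetical choices for exposition.

```python
import hashlib

def user_watermark_key(user_id: str, secret: str = "owner-secret") -> int:
    """Derive a distinct per-user watermark key (hypothetical hashing function,
    standing in for the paper's personalized hashing mechanism)."""
    digest = hashlib.sha256(f"{secret}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % (2 ** 32)

def watermark_bit(sentence_structure: tuple, key: int) -> int:
    """Map a sentence-structure signature (here, a toy tuple of syntactic tags)
    and a user key to a pseudo-random bit. A generator could then steer toward
    structures whose bit matches the intended watermark payload; a detector
    recomputes the same bits from the observed text and the candidate key."""
    h = hashlib.sha256((repr(sentence_structure) + str(key)).encode()).digest()
    return h[0] & 1
```

Because the key is derived per user, two users receive different bit patterns for the same structures, which is what enables tracing a leaked text back to an individual user rather than only to the model.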