Large Language Models (LLMs) are widely used in complex natural language processing tasks, but they raise privacy and security concerns because they lack a mechanism for identity recognition. This paper proposes CredID, a multi-party credible watermarking framework involving a trusted third party (TTP) and multiple LLM vendors, to address these issues. In the watermark embedding stage, a vendor requests a seed from the TTP and uses it to generate watermarked text, without sending the user's prompt to the TTP. In the extraction stage, the TTP coordinates the vendors, each of which extracts and verifies the watermark from the text. The result is a credible watermarking scheme that preserves vendor privacy. Furthermore, current watermarking algorithms struggle to balance text quality, information capacity, and robustness, which makes it difficult to meet the diverse identification needs of LLMs. We therefore propose a novel multi-bit watermarking algorithm and an open-source toolkit to facilitate research. Experiments show that CredID enhances watermark credibility and efficiency without compromising text quality. Additionally, we use the framework to achieve highly accurate identification among multiple LLM vendors.
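The two-stage flow described above (seed request at embedding time, TTP-coordinated extraction afterwards) can be illustrated with a minimal sketch. This is not the multi-bit algorithm proposed in the paper, and it does not use the toolkit's API: the class names `TTP` and `Vendor`, the HMAC-keyed bit assignment, the toy vocabulary, and the majority-vote confidence score are all assumptions chosen only to make the message flow concrete.

```python
import hashlib
import hmac
import random

VOCAB = [f"w{i}" for i in range(64)]  # toy vocabulary standing in for LLM tokens


def keyed_bit(seed, prev, token):
    """Pseudo-random bit for `token`, keyed by the seed and the previous token."""
    digest = hmac.new(seed, f"{prev}|{token}".encode(), hashlib.sha256).digest()
    return digest[0] & 1


class TTP:
    """Hypothetical trusted third party: issues seeds and coordinates extraction."""

    def __init__(self, master_key):
        self._master_key = master_key
        self.vendors = {}

    def issue_seed(self, vendor_name):
        # Only the vendor's name is sent here; the user's prompt never is.
        return hmac.new(self._master_key, vendor_name.encode(), hashlib.sha256).digest()

    def register(self, vendor):
        self.vendors[vendor.name] = vendor

    def identify(self, tokens):
        """Ask every vendor to extract locally, then keep the most confident result."""
        results = [(v.extract(tokens), name) for name, v in self.vendors.items()]
        (bits, score), name = max(results, key=lambda r: r[0][1])
        return name, bits, score


class Vendor:
    """Hypothetical LLM vendor that embeds and extracts an n-bit message."""

    def __init__(self, name, ttp, n_bits=4):
        self.name, self.n_bits = name, n_bits
        self.seed = ttp.issue_seed(name)  # embedding stage: request a seed
        ttp.register(self)

    def generate(self, message_bits, length=64, rng=None):
        rng = rng or random.Random(0)
        tokens, prev = [], "<s>"
        for i in range(length):
            target = message_bits[i % self.n_bits]
            candidates = rng.sample(VOCAB, len(VOCAB))  # stand-in for model sampling
            # Keep the first candidate whose keyed bit equals the message bit.
            token = next(t for t in candidates if keyed_bit(self.seed, prev, t) == target)
            tokens.append(token)
            prev = token
        return tokens

    def extract(self, tokens):
        votes = [[0, 0] for _ in range(self.n_bits)]  # [zeros, ones] per message bit
        prev = "<s>"
        for i, token in enumerate(tokens):
            votes[i % self.n_bits][keyed_bit(self.seed, prev, token)] += 1
            prev = token
        bits = [int(ones > zeros) for zeros, ones in votes]
        # Confidence: fraction of tokens that agree with the per-bit majority.
        score = sum(max(v) for v in votes) / max(len(tokens), 1)
        return bits, score


ttp = TTP(b"demo-master-key")
vendor_a, vendor_b = Vendor("vendor-a", ttp), Vendor("vendor-b", ttp)
text = vendor_a.generate([1, 0, 1, 1])
print(ttp.identify(text))
```

In this sketch the TTP only ever sees vendor names and extraction results, while each vendor runs extraction with its own seed. Text produced by one vendor decodes with full agreement under that vendor's seed and with near-chance agreement under any other, which is what lets the TTP attribute it.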