Text watermarking has emerged as an important technique for detecting machine-generated text. However, existing methods can severely degrade text quality due to arbitrary vocabulary partitioning, which disrupts the language model's expressiveness and impedes textual coherence. To mitigate this, we introduce XMark, a novel approach that capitalizes on text redundancy within the lexical space. Specifically, XMark incorporates a mutually exclusive rule for synonyms during the language model decoding process, thereby integrating prior knowledge into vocabulary partitioning and preserving the capabilities of language generation. We present theoretical analyses and empirical evidence demonstrating that XMark substantially enhances text generation fluency while maintaining watermark detectability. Furthermore, we investigate watermarking's impact on the emergent abilities of large language models, including zero-shot and few-shot knowledge recall, logical reasoning, and instruction following. Our comprehensive experiments confirm that XMark consistently outperforms existing methods in retaining these crucial capabilities of LLMs.
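To make the synonym rule concrete, the following is a minimal sketch of one plausible reading of it, not XMark's actual implementation. It assumes a partition-based scheme in which the vocabulary is split into a favored ("green") and a disfavored ("red") set, seeded by a hash of the previous token; that setup, the function name `partition_vocab`, the `key` parameter, the toy vocabulary, and the synonym pairs are all hypothetical and do not come from the abstract. The only idea taken from the text is that synonyms are treated as mutually exclusive during partitioning, so the two members of a pair never land in the same set.

```python
import hashlib
import random


def partition_vocab(vocab, synonym_pairs, prev_token, key="demo-key"):
    """Split `vocab` into (green, red) sets, seeded by the previous token.

    Assumption (not from the paper): each synonym pair is split across the
    two sets, and all remaining tokens are split at random into halves.
    """
    # Derive a reproducible seed from the secret key and the previous token.
    digest = hashlib.sha256(f"{key}:{prev_token}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))

    vocab_set = set(vocab)
    green, red, used = set(), set(), set()

    # Mutual exclusion: one member of each pair goes green, the other red.
    for a, b in synonym_pairs:
        if a == b or a not in vocab_set or b not in vocab_set:
            continue
        if a in used or b in used:
            continue  # simplification: a token joins at most one pair
        if rng.random() < 0.5:
            a, b = b, a
        green.add(a)
        red.add(b)
        used.update((a, b))

    # Remaining tokens: sort first so the shuffle is reproducible across runs.
    rest = sorted(vocab_set - used)
    rng.shuffle(rest)
    half = len(rest) // 2
    green.update(rest[:half])
    red.update(rest[half:])
    return green, red


# Toy data, purely illustrative.
toy_vocab = ["big", "large", "fast", "quick", "cat", "dog", "tree", "river"]
toy_pairs = [("big", "large"), ("fast", "quick")]

green, red = partition_vocab(toy_vocab, toy_pairs, prev_token="the")
print("green:", sorted(green))
print("red:  ", sorted(red))
```

Under this reading, whichever set the decoder favors always contains one word from each synonym pair, so a suitable wording remains available at every step. A purely random split can instead place both synonyms in the disfavored set, which is the kind of arbitrary partitioning the abstract identifies as harmful to text quality.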