Text watermarking algorithms for large language models (LLMs) embed hidden features into LLM-generated text to facilitate subsequent detection, thereby mitigating the misuse of LLMs. Although current text watermarking algorithms perform well in most high-entropy scenarios, their performance in low-entropy scenarios still needs improvement. In this work, we propose that the influence of token entropy be fully considered in the watermark detection process: the weight of each token should be adjusted according to its entropy during detection, rather than set to the same value for all tokens as in previous methods. Specifically, we propose Entropy-based Watermark Detection (EWD), which gives higher-entropy tokens higher weights during watermark detection so as to better reflect the degree of watermarking. Furthermore, the proposed detection process is training-free and fully automated. In experiments, we find that our method achieves better detection performance in low-entropy scenarios; it is also general and can be applied to texts with different entropy distributions. Our code and data will be available online.
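To make the idea of entropy-weighted detection concrete, the following is a minimal sketch rather than the method's actual implementation. It assumes a green-list style watermark in which each token is either "green" (watermark-consistent) or not, with a green-list ratio `gamma`; the abstract does not specify the weighting function or the test statistic, so the names `shannon_entropy` and `weighted_z_score`, the choice of weights, and the weighted z-score formula are all illustrative assumptions.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def weighted_z_score(
    is_green: Sequence[bool],
    weights: Sequence[float],
    gamma: float = 0.5,
) -> float:
    """Weighted z-score for green-token counts (assumed statistic).

    Under the null hypothesis (no watermark), each token is assumed to be
    green independently with probability `gamma`. Token i contributes
    weights[i] instead of 1, so tokens with larger weights (for example,
    higher entropy) influence the score more.
    """
    if len(is_green) != len(weights):
        raise ValueError("is_green and weights must have the same length")
    total_weight = sum(weights)
    sum_sq = sum(w * w for w in weights)
    if sum_sq == 0.0:
        return 0.0  # no token carries any weight, so there is no evidence
    observed = sum(w for g, w in zip(is_green, weights) if g)
    expected = gamma * total_weight
    std = math.sqrt(gamma * (1.0 - gamma) * sum_sq)
    return (observed - expected) / std
```

With all weights equal to 1, this statistic reduces to the usual unweighted z-score over green-token counts, which corresponds to the "same weight for all tokens" baseline described above. In a low-entropy text, most tokens are nearly deterministic and can rarely be steered toward the green list, so down-weighting them keeps them from diluting the evidence carried by the few high-entropy tokens.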