Text watermarking algorithms for large language models (LLMs) embed hidden features into LLM-generated text so that it can be detected later, which helps mitigate the misuse of LLMs. Although current text watermarking algorithms perform well in most high-entropy scenarios, their performance in low-entropy scenarios still needs improvement. In this work, we propose that watermark detection should fully account for token entropy: the weight of each token during detection should be adjusted according to its entropy, rather than set to the same value for all tokens as in previous methods. Specifically, we propose Entropy-based Watermark Detection (EWD), which gives higher-entropy tokens higher influence weights during detection so as to better reflect the degree of watermarking. Furthermore, the proposed detection process is training-free and fully automated. In our experiments, EWD achieves better detection performance in low-entropy scenarios; the method is also general and can be applied to texts with different entropy distributions. Our code and data will be available online.
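The idea of entropy-weighted detection can be illustrated with a minimal sketch. The abstract does not specify the weight function or the detection statistic, so everything below is an assumption made for illustration: a green/red-list style watermark in which a fraction `gamma` of the vocabulary is "green" at each step, a weight equal to the token's Shannon entropy, and a weighted z-score as the test statistic. The function names (`shannon_entropy`, `weighted_z_score`, `uniform_z_score`) are hypothetical and are not taken from the authors' code.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def weighted_z_score(
    green_flags: Sequence[bool],
    entropies: Sequence[float],
    gamma: float = 0.5,
) -> float:
    """Weighted z-score for the hypothesis "this text is not watermarked".

    Assumption: the weight of a token is simply its entropy. Under the null
    hypothesis each token is green independently with probability gamma, so
    the weighted green count has mean gamma * sum(w) and variance
    gamma * (1 - gamma) * sum(w^2).
    """
    weights = list(entropies)
    sum_sq = sum(w * w for w in weights)
    if sum_sq == 0.0:
        return 0.0  # no token carries any weight, so there is no evidence
    observed = sum(w for w, g in zip(weights, green_flags) if g)
    expected = gamma * sum(weights)
    return (observed - expected) / math.sqrt(gamma * (1.0 - gamma) * sum_sq)


def uniform_z_score(green_flags: Sequence[bool], gamma: float = 0.5) -> float:
    """Baseline in which every token has the same weight."""
    return weighted_z_score(green_flags, [1.0] * len(green_flags), gamma)


# Toy low-entropy text: 4 high-entropy tokens that the watermark pushed to
# green, plus 12 near-deterministic tokens that are green only by chance.
flags = [True] * 4 + [True, False] * 6
ents = [2.0] * 4 + [0.05] * 12

z_uniform = uniform_z_score(flags)
z_weighted = weighted_z_score(flags, ents)
print(f"uniform z = {z_uniform:.3f}, entropy-weighted z = {z_weighted:.3f}")
```

In this toy case the near-deterministic tokens dilute the uniform score, while the entropy-weighted score is dominated by the few tokens where the watermark could actually act. Other weight functions are equally plausible under the description in the abstract; the linear choice here is only the simplest one.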