Large language models (LLMs) excel at generating human-like text, but they also raise concerns about misuse for fake news and academic dishonesty. Decoding-based watermarking, particularly the watermark based on the GumbelMax trick (GM watermark), is a standout solution for safeguarding machine-generated text due to its notable detectability. However, the GM watermark faces a major challenge with generation diversity: it always yields identical outputs for the same prompt, which harms both diversity and user experience. To overcome this limitation, we propose a new type of GM watermark, the Logits-Addition watermark, and its three variants, specifically designed to enhance diversity. Among these, the GumbelSoft watermark (a softmax variant of the Logits-Addition watermark) demonstrates superior performance in high-diversity settings, with its AUROC score exceeding those of the two alternative variants by 0.1 to 0.3 and surpassing other decoding-based watermarking methods by at least 0.1.
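The diversity issue can be illustrated with a minimal sketch. This is not the authors' implementation: the key-hashing scheme, the function names, and the temperature parameter are assumptions made for illustration. It shows that taking the argmax over logits plus key-derived Gumbel noise always picks the same token for a fixed key and context, whereas sampling from a softmax over the same perturbed logits (one plausible reading of the softmax variant) can return different tokens.

```python
import hashlib
import math
import random


def keyed_gumbel_noise(key, context, vocab_size):
    """Pseudo-random Gumbel(0, 1) noise fixed by a secret key and the context.

    Hypothetical seeding scheme: hash the key and context, then seed a PRNG.
    """
    digest = hashlib.sha256(f"{key}|{context}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # -log(-log(U)) with U ~ Uniform(0, 1) is a standard Gumbel draw.
    return [
        -math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
        for _ in range(vocab_size)
    ]


def gumbel_max_token(logits, noise):
    """Argmax of logits + noise: deterministic once the noise is fixed."""
    scores = [l + g for l, g in zip(logits, noise)]
    return max(range(len(scores)), key=scores.__getitem__)


def gumbel_soft_token(logits, noise, temperature, rng):
    """Sample from softmax((logits + noise) / temperature) instead of argmax."""
    scores = [(l + g) / temperature for l, g in zip(logits, noise)]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.5, 1.0, 0.5]  # toy next-token logits
    noise = keyed_gumbel_noise("secret-key", (17, 42), len(logits))

    # Same key and context: the argmax rule repeats the same token every time.
    hard = {gumbel_max_token(logits, noise) for _ in range(100)}

    # The softmax rule draws from a distribution, so tokens may vary.
    sampler = random.Random(0)
    soft = {gumbel_soft_token(logits, noise, 1.0, sampler) for _ in range(1000)}

    print("distinct tokens (argmax): ", len(hard))
    print("distinct tokens (softmax):", len(soft))
```

In this sketch, a lower temperature pushes the softmax rule toward the argmax rule, which trades diversity for a stronger tie between the chosen token and the keyed noise.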