To mitigate the potential risks of language models, recent work on AI-generated text detection proposes watermarking machine-generated text by randomly restricting the vocabulary during generation and then using that restriction as a detection signal. Although these watermarks cause only a slight deterioration in perplexity, our empirical investigation shows that they substantially degrade the performance of conditional text generation. To address this issue, we introduce a simple yet effective semantic-aware watermarking algorithm that takes into account the characteristics of conditional text generation and the input context. Experimental results demonstrate that our proposed method yields substantial improvements across various text generation models, including BART and Flan-T5, on tasks such as summarization and data-to-text generation, while maintaining detection ability.
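For readers unfamiliar with watermarking through random vocabulary restrictions, the following is a minimal sketch of the general idea, not the semantic-aware algorithm proposed here. It assumes a hard restriction in which each token is drawn only from a pseudo-random "green" subset of the vocabulary seeded by the previous token, and a detector that counts green tokens and computes a one-proportion z-score. The function names, the SHA-256 seeding, the green fraction `gamma`, and the uniform toy generator that stands in for a language model are all illustrative assumptions.

```python
import hashlib
import math
import random
from typing import List, Set


def green_list(prev_token: int, vocab_size: int, gamma: float = 0.5) -> Set[int]:
    """Pseudo-randomly select a 'green' subset of the vocabulary.

    The subset is seeded by the previous token id, so a detector that knows
    the seeding rule can recompute it without access to the model.
    (The hashing scheme is an assumption made for this sketch.)
    """
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))


def generate_watermarked(length: int, vocab_size: int,
                         gamma: float = 0.5, seed: int = 0) -> List[int]:
    """Toy generator: sample uniformly from the green list at every step.

    A real system would restrict or bias a language model's logits instead.
    """
    rng = random.Random(seed)
    tokens = [rng.randrange(vocab_size)]
    for _ in range(length - 1):
        allowed = sorted(green_list(tokens[-1], vocab_size, gamma))
        tokens.append(rng.choice(allowed))
    return tokens


def watermark_z_score(tokens: List[int], vocab_size: int,
                      gamma: float = 0.5) -> float:
    """z-score of the green-token count under the 'no watermark' hypothesis."""
    n = len(tokens) - 1  # the first token has no predecessor to seed from
    if n <= 0:
        raise ValueError("need at least two tokens")
    hits = sum(
        tokens[i] in green_list(tokens[i - 1], vocab_size, gamma)
        for i in range(1, len(tokens))
    )
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))


if __name__ == "__main__":
    watermarked = generate_watermarked(101, vocab_size=1000)
    plain_rng = random.Random(1)
    plain = [plain_rng.randrange(1000) for _ in range(101)]
    print("watermarked z:", watermark_z_score(watermarked, 1000))
    print("plain z:", watermark_z_score(plain, 1000))
```

Under these assumptions, every scored token of the watermarked sequence falls in its green list, so the z-score is large, whereas unrestricted text should score near zero. The sketch also hints at why such restrictions can hurt conditional generation: the green list is chosen without regard to the input, so tokens the source text calls for may be excluded.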