To mitigate the potential risks of language models, recent AI detection research proposes embedding watermarks in machine-generated text through random vocabulary restrictions and using this information for detection. Although these watermarks cause only a slight deterioration in perplexity, our empirical investigation reveals that they significantly degrade the performance of conditional text generation. To address this issue, we introduce a simple yet effective semantic-aware watermarking algorithm that takes into account the characteristics of conditional text generation and the input context. Experimental results demonstrate that the proposed method yields substantial improvements across various text generation models, including BART and Flan-T5, on tasks such as summarization and data-to-text generation, while maintaining detection ability.
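As a rough illustration of the baseline this abstract refers to, watermarking through random vocabulary restrictions can be sketched as follows. This is a minimal toy under stated assumptions, not the method proposed here: the "model" samples tokens uniformly at random, the restricted ("green") vocabulary is seeded by the previous token, and the names `green_list`, `generate`, `z_score`, and the constant `GAMMA` are all hypothetical. The semantic-aware variant is not shown, since the abstract does not specify its details.

```python
import hashlib
import math
import random

GAMMA = 0.5  # assumed fraction of the vocabulary kept on the "green" list


def green_list(prev_token, vocab_size, gamma=GAMMA):
    """Pseudo-randomly choose the allowed token ids, seeded by the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])


def generate(length, vocab_size, watermark, rng, start_token=0):
    """Toy generator: uniform sampling stands in for a real language model."""
    tokens = [start_token]
    for _ in range(length):
        if watermark:
            # Hard restriction: only green-list tokens may be emitted.
            candidates = sorted(green_list(tokens[-1], vocab_size))
        else:
            candidates = range(vocab_size)
        tokens.append(rng.choice(candidates))
    return tokens


def z_score(tokens, vocab_size, gamma=GAMMA):
    """One-proportion z-test on the share of tokens that fall on their green list."""
    scored = len(tokens) - 1  # the start token is not scored
    hits = sum(
        tok in green_list(prev, vocab_size, gamma)
        for prev, tok in zip(tokens, tokens[1:])
    )
    return (hits - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


if __name__ == "__main__":
    rng = random.Random(0)
    marked = generate(100, 50, watermark=True, rng=rng)
    plain = generate(100, 50, watermark=False, rng=rng)
    print("watermarked z:", z_score(marked, 50))
    print("unwatermarked z:", z_score(plain, 50))
```

In this toy setting every watermarked token lands on its green list, so the z-score is large, while unwatermarked text hits the green list only about half the time and scores near zero. A detector would flag text whose z-score exceeds a chosen threshold. The hard restriction also hints at why conditional generation can suffer: tokens that the input requires may be excluded from the allowed set.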