Text watermarks in large language models (LLMs) are increasingly used to detect synthetic text, mitigating misuse such as fake news and academic dishonesty. Existing watermark detection techniques focus primarily on classifying an entire document as watermarked or not, and they often neglect the common scenario of identifying individual watermarked segments within longer, mixed-source documents. Drawing inspiration from plagiarism detection systems, we propose two novel methods for partial watermark detection. First, we develop a geometry cover detection framework that determines whether a long text contains a watermarked segment. Second, we introduce an adaptive online learning algorithm that pinpoints the precise location of watermarked segments within the text. Evaluated on three popular watermarking techniques (KGW-Watermark, Unigram-Watermark, and Gumbel-Watermark), our approach achieves high accuracy and significantly outperforms baseline methods. Moreover, our framework can be adapted to other watermarking techniques, offering new insights for precise watermark detection.