Rapid advances in Large Language Models (LLMs) have greatly improved the capabilities of text generators. As the potential for misuse grows, it has become increasingly important to determine whether a text was written by a human or generated by an LLM. Several prior studies address this problem with binary classifiers that distinguish human-written from LLM-generated text, but the reliability of these classifiers has been questioned. Because consequential decisions may depend on the classification result, text source detection must be highly reliable. This paper therefore introduces DeepTextMark, a deep learning-based text watermarking method for text source identification. DeepTextMark uses Word2Vec and sentence encoding for watermark insertion and a transformer-based classifier for watermark detection, and it combines blindness, robustness, imperceptibility, and reliability. As detailed in the paper, these properties are essential for universal text source detection; this paper focuses on text produced by LLMs. DeepTextMark can be applied as an "add-on" to existing text generation systems, requiring no access to or modification of the underlying text generation mechanism. Experimental evaluations demonstrate DeepTextMark's high imperceptibility, high detection accuracy, strong robustness, reliability, and fast execution.
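To make the watermark insertion idea in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that insertion works by substituting one word with its nearest embedding neighbor and keeping the candidate sentence whose encoding stays closest to the original. The tiny `TOY_VECTORS` table is a hypothetical stand-in for a trained Word2Vec model, and averaging word vectors is a hypothetical stand-in for a real sentence encoder. The function names are illustrative only. The transformer-based detector is not sketched here.

```python
import math

# Hypothetical toy embeddings; a real system would load a trained Word2Vec model.
TOY_VECTORS = {
    "quick": (0.9, 0.1, 0.0),
    "fast": (0.85, 0.15, 0.05),
    "slow": (-0.8, 0.2, 0.1),
    "fox": (0.0, 0.9, 0.2),
    "runs": (0.1, 0.2, 0.9),
    "the": (0.3, 0.3, 0.3),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_substitute(word):
    """Return the most similar *other* vocabulary word, or None if unknown."""
    if word not in TOY_VECTORS:
        return None
    scored = [
        (cosine(TOY_VECTORS[word], vec), other)
        for other, vec in TOY_VECTORS.items()
        if other != word
    ]
    return max(scored)[1]


def encode_sentence(words):
    """Mean of known word vectors; an assumed stand-in for a sentence encoder."""
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def insert_watermark(sentence):
    """Try one substitution per word position and keep the candidate whose
    sentence encoding is closest to the original sentence's encoding."""
    words = sentence.lower().split()
    original = encode_sentence(words)
    best_words, best_score = None, -1.0
    for i, word in enumerate(words):
        substitute = nearest_substitute(word)
        if substitute is None:
            continue
        candidate = words[:i] + [substitute] + words[i + 1:]
        score = cosine(original, encode_sentence(candidate))
        if score > best_score:
            best_words, best_score = candidate, score
    if best_words is None:
        # No word could be substituted; return the input unchanged.
        return sentence, best_score
    return " ".join(best_words), best_score


if __name__ == "__main__":
    marked, similarity = insert_watermark("the quick fox runs")
    print(marked, round(similarity, 4))
```

With these toy vectors, the sketch replaces "quick" with "fast" because that candidate changes the averaged sentence vector the least. Selecting the candidate by sentence-level similarity is what the sketch uses to approximate imperceptibility; how the substitution pattern is then recognized by a classifier is outside its scope.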