The capabilities of text generators have grown with the rapid development of Large Language Models (LLMs). To prevent potential misuse, the ability to detect whether a text was produced by an LLM has become increasingly important. Several related works have attempted to solve this problem using binary classifiers that categorize input text as human-written or LLM-generated. However, these classifiers have been shown to be unreliable. Because impactful decisions could be made based on the classification result, text source detection needs to be of high quality. To this end, this paper presents DeepTextMark, a deep learning-based text watermarking method for text source detection. Applying Word2Vec and Sentence Encoding for watermark insertion and a transformer-based classifier for watermark detection, DeepTextMark achieves blindness, robustness, imperceptibility, and reliability simultaneously. As discussed further in the paper, these traits are indispensable for generic text source detection; the application focus of this paper is text generated by LLMs. DeepTextMark can be implemented as an "add-on" to existing text generation systems; that is, the method does not require access to, or modification of, the text generation technique. Experiments show that DeepTextMark achieves high imperceptibility, high detection accuracy, enhanced robustness, reliability, and fast running speed.
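To make the insertion idea concrete, the following is a minimal sketch of substitution-based watermarking, not the paper's implementation. It assumes that candidates are produced by replacing one word with its nearest neighbor in an embedding space, and that the candidate whose sentence embedding stays closest to the original is kept. The toy `WORD_VECTORS` table stands in for a trained Word2Vec model, the mean-pooling `encode_sentence` stands in for a real sentence encoder, and all function names are hypothetical. The transformer-based detector is not shown.

```python
import math

# Toy word vectors standing in for a trained Word2Vec model (hypothetical values).
WORD_VECTORS = {
    "quick": (0.9, 0.1, 0.0),
    "fast": (0.85, 0.15, 0.05),
    "slow": (-0.8, 0.2, 0.1),
    "fox": (0.0, 0.9, 0.1),
    "dog": (0.1, 0.8, 0.3),
    "runs": (0.2, 0.1, 0.9),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_word(word):
    """Return the most similar *other* word in the toy vocabulary, or None."""
    if word not in WORD_VECTORS:
        return None
    others = [w for w in WORD_VECTORS if w != word]
    return max(others, key=lambda w: cosine(WORD_VECTORS[word], WORD_VECTORS[w]))


def encode_sentence(words):
    """Stand-in sentence encoder (assumption): mean of the known word vectors."""
    vecs = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not vecs:
        return (0.0, 0.0, 0.0)
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def insert_watermark(words):
    """Substitute one word, keeping the candidate closest to the original sentence."""
    original = encode_sentence(words)
    best, best_score = None, -1.0
    for i, word in enumerate(words):
        substitute = nearest_word(word)
        if substitute is None:
            continue  # word not in vocabulary: no candidate at this position
        candidate = words[:i] + [substitute] + words[i + 1:]
        score = cosine(original, encode_sentence(candidate))
        if score > best_score:
            best, best_score = candidate, score
    # If no word could be substituted, return the text unchanged.
    return best if best is not None else list(words)


if __name__ == "__main__":
    print(insert_watermark(["the", "quick", "fox", "runs"]))
```

Because the sketch only reads and rewrites the output text, it reflects the "add-on" property described above: nothing about the text generator itself is accessed or modified.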