Multiscale Positive-Unlabeled Detection of AI-Generated Texts

Recent Large Language Models (LLMs) such as ChatGPT are remarkably good at generating human-like text, but they can be misused to produce fake scholarly texts, fake news, fake tweets, and so on. Previous work has proposed methods to detect such AI-generated texts, including simple ML classifiers, training-agnostic methods based on pretrained models, and finetuned language classification models. However, mainstream detectors are formulated without considering corpus length: shorter corpora are harder to detect than longer ones because they carry fewer informative features. In this paper, we propose a Multiscale Positive-Unlabeled (MPU) training framework to address the challenge of multiscale text detection. First, we acknowledge the human-resemblance property of short machine texts and rephrase text classification as a Positive-Unlabeled (PU) problem by marking these short machine texts as "unlabeled" during training. In this PU context, we propose the length-sensitive Multiscale PU Loss, in which an abstract recurrent model is used to estimate the positive priors of corpora at different scales. Additionally, we introduce a Text Multiscaling module to enrich the training corpora. Experiments show that our MPU method improves detection performance on long AI-generated texts and significantly improves short-corpus detection for language-model-based detectors. Language models trained with MPU can outperform existing detectors by large margins on multiscale AI-generated texts. The code is available at https://github.com/mindspore-lab/mindone/tree/master/examples/detect_chatgpt and https://github.com/huawei-noah/Efficient-Computing/AIGC_text_detector.
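To make the idea of a length-sensitive PU loss more concrete, the following is a minimal sketch and not the implementation from the repositories above. It uses the standard non-negative PU risk estimator with a sigmoid surrogate loss, and plugs in a prior that depends on text length. The function `length_prior`, its exponential form, and the parameters `short_prior` and `decay` are hypothetical placeholders chosen for illustration; the abstract only states that the prior is estimated with a recurrent model in abstraction, and that estimator is not reproduced here.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss for a raw logit `score` and a label in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def length_prior(num_tokens, short_prior=0.4, decay=0.02):
    """HYPOTHETICAL prior: the assumed share of positive-looking samples among
    unlabeled texts, decreasing as texts get longer. This is a placeholder,
    not the recurrent-model estimate mentioned in the abstract."""
    return short_prior * math.exp(-decay * num_tokens)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk for logits of positive and unlabeled samples.

    risk = prior * R_p^+  +  max(0, R_u^- - prior * R_p^-)
    """
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # Estimated negative-class risk; clamped at zero so it cannot go negative.
    negative_part = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, negative_part)


if __name__ == "__main__":
    # Toy logits; in practice these would come from a detector's output.
    positive_logits = [2.1, 1.3, 0.4]
    unlabeled_logits = [0.2, -0.7, -1.5]
    for tokens in (20, 200):
        prior = length_prior(tokens)
        risk = nn_pu_risk(positive_logits, unlabeled_logits, prior)
        print(f"tokens={tokens} prior={prior:.3f} risk={risk:.3f}")
```

In this sketch a short text receives a larger prior than a long one, so short unlabeled texts are treated as more likely to resemble the positive class, which mirrors the human-resemblance argument made for short machine texts.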
