Smaller Language Models are Better Black-box Machine-Generated Text Detectors

With the advent of fluent generative language models that can produce convincing utterances very similar to those written by humans, distinguishing whether a piece of text is machine-generated or human-written has become both more challenging and more important, as such models could be used to spread misinformation, fake news, and fake reviews, and to mimic certain authors and figures. To this end, a slew of methods has been proposed to detect machine-generated text. Most of these methods need access to the logits of the target model or need the ability to sample from the target. One such black-box detection method relies on the observation that generated text is locally optimal under the likelihood function of the generator, while human-written text is not. We find that overall, smaller and partially-trained models are better universal text detectors: they can more precisely detect text generated from both small and larger models. Interestingly, we find that whether the detector and generator were trained on the same data is not critically important to the detection success. For instance, the OPT-125M model has an AUC of 0.81 in detecting ChatGPT generations, whereas a larger model from the GPT family, GPTJ-6B, has an AUC of 0.45.
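To make the local-optimality idea concrete, here is a minimal sketch, not the authors' implementation. It assumes one common way of testing local optimality: compare the log-likelihood a detector model assigns to a text with the mean log-likelihood it assigns to slightly perturbed rewrites of that text. A large positive gap suggests the text sits near a local maximum and is therefore more likely machine-generated. The function names, the toy log-likelihood table, and the perturbed strings are all hypothetical; a real detector would obtain log-likelihoods from a language model such as OPT-125M.

```python
from statistics import mean
from typing import Callable, Sequence


def curvature_score(
    text: str,
    neighbours: Sequence[str],
    log_likelihood: Callable[[str], float],
) -> float:
    """Log-likelihood of `text` minus the mean log-likelihood of its perturbed neighbours."""
    return log_likelihood(text) - mean(log_likelihood(n) for n in neighbours)


def auc(machine_scores: Sequence[float], human_scores: Sequence[float]) -> float:
    """Fraction of (machine, human) pairs where the machine text scores higher; ties count 0.5."""
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))


# Hypothetical log-likelihoods standing in for a real detector model.
TOY_LL = {
    "machine text": -10.0,
    "machine text v1": -14.0,
    "machine text v2": -16.0,
    "human text": -20.0,
    "human text v1": -19.0,
    "human text v2": -22.0,
}

machine_score = curvature_score(
    "machine text", ["machine text v1", "machine text v2"], TOY_LL.__getitem__
)
human_score = curvature_score(
    "human text", ["human text v1", "human text v2"], TOY_LL.__getitem__
)
print(machine_score, human_score, auc([machine_score], [human_score]))
```

In this toy setup the machine text loses far more likelihood under perturbation than the human text, so it receives the higher score. The AUC figures quoted in the abstract measure how often such a ranking is correct across many machine and human samples: 0.5 corresponds to chance, and values below 0.5 mean the detector ranks human text above machine text more often than not.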
