Evade ChatGPT Detectors via A Single Space

ChatGPT brings revolutionary social value but also raises concerns about the misuse of AI-generated text. Consequently, an important question is how to detect whether a text was generated by ChatGPT or written by a human. Existing detectors are built on the assumption that there are distributional gaps between human-generated and AI-generated text. These gaps are typically identified using statistical information or classifiers. Our research challenges this distributional gap assumption. We find that detectors do not effectively discriminate based on the semantic and stylistic gaps between human-generated and AI-generated text. Instead, "subtle differences", such as an extra space, become crucial for detection. Based on this discovery, we propose the SpaceInfi strategy to evade detection. Experiments demonstrate the effectiveness of this strategy across multiple benchmarks and detectors. We also provide a theoretical explanation for why SpaceInfi succeeds in evading perplexity-based detection, and we show empirically that a phenomenon called token mutation causes the evasion for language model-based detectors. Our findings offer new insights and challenges for understanding and constructing more applicable ChatGPT detectors.
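To give a sense of how small the perturbation described above is, here is a minimal sketch that adds exactly one space to a text. The function name `insert_single_space`, the `seed` parameter, and the choice to place the space before a comma are assumptions made for illustration; the abstract states only that an extra space is involved and does not say where it is inserted.

```python
import random


def insert_single_space(text: str, seed: int = 0) -> str:
    """Return `text` with exactly one extra space added.

    Assumption (not from the abstract): the space goes directly before a
    randomly chosen comma. If the text has no comma, the space is appended.
    """
    comma_positions = [i for i, ch in enumerate(text) if ch == ","]
    if not comma_positions:
        # Hypothetical fallback: no comma to anchor on, so append the space.
        return text + " "
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    pos = rng.choice(comma_positions)
    return text[:pos] + " " + text[pos:]


original = "Detectors look for distributional gaps, but small edits matter."
perturbed = insert_single_space(original)
print(repr(original))
print(repr(perturbed))
```

The two strings differ by a single character, which a human reader would hardly notice. Whether and why such a change affects a particular detector is the subject of the paper's experiments, not something this sketch demonstrates.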

