Detecting text generated by large language models (LLMs) has attracted great recent interest. With zero-shot methods like DetectGPT, detection capabilities have reached impressive levels. However, the reliability of existing detectors in real-world applications remains underexplored. In this study, we present DetectRL, a new benchmark showing that even state-of-the-art (SOTA) detection techniques still underperform on this task. We collected human-written datasets from domains where LLMs are particularly prone to misuse. Using popular LLMs, we generated data that better aligns with real-world applications. Unlike previous studies, we employed heuristic rules to create adversarial LLM-generated text, simulating various prompt usages, human revisions such as word substitutions, and writing noise such as spelling mistakes. Our development of DetectRL reveals the strengths and limitations of current SOTA detectors. More importantly, we analyzed the potential impact of writing styles, model types, attack methods, text lengths, and real-world human writing factors on different types of detectors. We believe DetectRL could serve as an effective benchmark for assessing detectors in real-world scenarios, evolving with advanced attack methods, and thus providing a more stringent evaluation to drive the development of more efficient detectors. Data and code are publicly available at: https://github.com/NLP2CT/DetectRL.