As natural language models like ChatGPT become increasingly prevalent in applications and services, robust and accurate methods for detecting their output are of paramount importance. In this paper, we present the GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess how well detection models identify responses generated by ChatGPT. The dataset consists of a diverse collection of context-prompt pairs based on Reddit, with human-generated and ChatGPT-generated responses. We analyze the dataset's characteristics, including linguistic diversity, context complexity, and response quality. To showcase the dataset's utility, we benchmark several detection methods on it and demonstrate their efficacy in distinguishing between human-generated and ChatGPT-generated responses. The dataset serves as a resource for evaluating and advancing detection techniques for ChatGPT, and it contributes to ongoing efforts to ensure responsible and trustworthy AI-driven communication on the internet. Finally, we propose GpTen, a novel tensor-based GPT text detection method that is semi-supervised in nature, since it has access only to human-generated text, yet performs on par with fully supervised baselines.