Large language models (LLMs) such as ChatGPT are increasingly used for a wide range of applications, including generating text content at scale. Although detection methods for such AI-generated text already exist, we investigate ChatGPT's own performance as a detector of AI-generated text, inspired by work that uses ChatGPT as a data labeler or annotator. We evaluate the zero-shot performance of ChatGPT on the task of distinguishing human-written from AI-generated text, with experiments on publicly available datasets. We empirically investigate whether ChatGPT is equally effective at detecting AI-generated and human-written text. Our findings provide insight into how ChatGPT and similar LLMs may be leveraged in automated detection pipelines by focusing on solving one specific aspect of the problem and deriving the rest from that solution. All code and data are available at https://github.com/AmritaBh/ChatGPT-as-Detector.
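The question of whether detection is equally effective on both classes can be made concrete by computing recall separately for human-written and AI-generated examples. The following is a minimal sketch under stated assumptions, not the authors' implementation: the prompt wording, the names `PROMPT_TEMPLATE`, `parse_label`, and `per_class_recall`, and the stub detector are all hypothetical. The `detector` argument stands in for any function that sends a prompt to an LLM and returns its text response.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical zero-shot prompt; the wording is an assumption, not the paper's prompt.
PROMPT_TEMPLATE = (
    "Is the following text written by a human or generated by an AI? "
    "Answer with 'human' or 'AI'.\n\nText: {text}"
)


def parse_label(response: str) -> str:
    """Map a free-form model response to 'human', 'ai', or 'unclear'."""
    lowered = response.lower()
    says_human = "human" in lowered
    # Match 'ai' only as a whole word so that words like 'said' do not count.
    says_ai = re.search(r"\bai\b", lowered) is not None
    if says_human == says_ai:  # both mentioned, or neither
        return "unclear"
    return "human" if says_human else "ai"


def per_class_recall(
    examples: Iterable[Tuple[str, str]],
    detector: Callable[[str], str],
) -> Dict[str, float]:
    """Return the fraction of correctly labeled examples for each gold class."""
    correct: Counter = Counter()
    total: Counter = Counter()
    for text, gold in examples:
        total[gold] += 1
        predicted = parse_label(detector(PROMPT_TEMPLATE.format(text=text)))
        if predicted == gold:
            correct[gold] += 1
    return {label: correct[label] / total[label] for label in total}


# Stub detector for illustration only: it always answers 'human'.
def always_human(prompt: str) -> str:
    return "This text was written by a human."


toy_examples = [
    ("I walked my dog this morning.", "human"),
    ("As a language model, I can summarize this.", "ai"),
]
recalls = per_class_recall(toy_examples, always_human)
print(recalls)
```

With the stub detector, recall is perfect on the human-written class and zero on the AI-generated class, which is the kind of asymmetry a per-class breakdown exposes and an aggregate accuracy figure would hide.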