AI Content Self-Detection for Transformer-based Large Language Models

Generative artificial intelligence (AI) tools based on large language models, including ChatGPT, Bard, and Claude, have many exciting applications in text generation, with the potential for substantial productivity gains. One open issue is authorship attribution when AI tools are used. This is especially important in an academic setting, where the inappropriate use of generative AI tools may hinder student learning or stifle research by creating a large amount of automatically generated derivative work. Existing plagiarism detection systems can trace the source of submitted text but are not yet equipped with methods to accurately detect AI-generated text. This paper introduces the idea of direct origin detection and evaluates whether generative AI systems can recognize their own output and distinguish it from human-written text. We argue why current transformer-based models may be able to self-detect their own generated text, and we perform a small empirical study using zero-shot learning to investigate whether that is the case. The results reveal varying capabilities of AI systems to identify their own generated text. Google's Bard model exhibits the strongest self-detection capability, with an accuracy of 94%, followed by OpenAI's ChatGPT with 83%. Anthropic's Claude model, on the other hand, does not appear to be able to self-detect.
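As a rough illustration of what a zero-shot self-detection study of this kind might look like, the sketch below asks a model whether it wrote each text and scores the answers against known labels. It is not the paper's protocol: the prompt wording, the function names (`build_prompt`, `parse_answer`, `self_detection_accuracy`), and the `ask_model` callable are all assumptions made for illustration, since the abstract does not give these details.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical prompt wording; the abstract does not state the actual prompt.
PROMPT_TEMPLATE = (
    "Was the following text generated by you? Answer 'yes' or 'no'.\n\n"
    "Text:\n{text}"
)


def build_prompt(text: str) -> str:
    """Wrap a candidate text in the zero-shot self-detection question."""
    return PROMPT_TEMPLATE.format(text=text)


def parse_answer(reply: str) -> bool:
    """Treat a reply starting with 'yes' as a claim of authorship."""
    return reply.strip().lower().startswith("yes")


def self_detection_accuracy(
    ask_model: Callable[[str], str],
    samples: Iterable[Tuple[str, bool]],
) -> float:
    """Fraction of samples where the model's answer matches the label.

    `ask_model` is an assumed stand-in for a call to a chat model.
    Each sample is (text, is_model_generated).
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    correct = sum(
        parse_answer(ask_model(build_prompt(text))) == is_generated
        for text, is_generated in samples
    )
    return correct / len(samples)


if __name__ == "__main__":
    # Toy stand-in for a real model, used only to exercise the harness.
    def stub_model(prompt: str) -> str:
        return "Yes" if "furthermore" in prompt.lower() else "No"

    demo_samples = [
        ("Furthermore, the results indicate a clear trend.", True),
        ("i kinda liked it tbh", False),
        ("The cat sat quietly.", True),
        ("Furthermore, I disagree.", False),
    ]
    print(self_detection_accuracy(stub_model, demo_samples))
```

In a real study, `ask_model` would wrap a call to the system under test, and the samples would pair that system's own generations with human-written texts on the same topics.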

