From Text to Source: Results in Detecting Large Language Model-Generated Content

The widespread use of Large Language Models (LLMs), celebrated for their ability to generate human-like text, has raised concerns about misinformation and ethical implications. Addressing these concerns requires robust methods to detect and attribute text generated by LLMs. This paper investigates "Cross-Model Detection," evaluating whether a classifier trained to distinguish text generated by a source LLM from human-written text can also detect text from a target LLM without further training. The study explores a range of LLM sizes and families and assesses the impact of conversational fine-tuning techniques on classifier generalization. The research also covers Model Attribution, encompassing source model identification, model family classification, and model size classification. Our results reveal several key findings. First, there is a clear inverse relationship between classifier effectiveness and model size: larger LLMs are more challenging to detect, especially when the classifier is trained on data from smaller models. Second, training on data from similarly sized LLMs can improve detection performance on larger models but may decrease performance on smaller models. Additionally, model attribution experiments show promising results in identifying source models and model families, highlighting detectable signatures in LLM-generated text. Overall, our study offers insights into the interplay of model size, family, and training data in LLM detection and attribution.
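The cross-model detection protocol described in the abstract can be sketched roughly as follows: train one detector per source LLM, then score each detector on every target LLM without retraining. This is a minimal sketch under stated assumptions, not the paper's implementation. All names here (`train_length_threshold`, `accuracy`, `cross_model_matrix`) are hypothetical, and the word-count detector is only a toy stand-in for whatever classifier the study actually uses.

```python
from typing import Callable, Dict, List, Tuple

# (text, label): label 1 = LLM-generated, label 0 = human-written.
Sample = Tuple[str, int]
Classifier = Callable[[str], int]


def train_length_threshold(samples: List[Sample]) -> Classifier:
    """Toy detector (an assumption, not the paper's model).

    Thresholds on word count, using the midpoint between the mean
    word counts of the human-written and LLM-generated classes.
    """
    def mean_len(label: int) -> float:
        lens = [len(text.split()) for text, y in samples if y == label]
        return sum(lens) / len(lens)

    human_mean, llm_mean = mean_len(0), mean_len(1)
    threshold = (human_mean + llm_mean) / 2
    llm_is_longer = llm_mean >= human_mean

    def predict(text: str) -> int:
        longer = len(text.split()) >= threshold
        # Predict "LLM" when the text falls on the LLM side of the threshold.
        return int(longer == llm_is_longer)

    return predict


def accuracy(clf: Classifier, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(clf(text) == y for text, y in samples) / len(samples)


def cross_model_matrix(
    train_sets: Dict[str, List[Sample]],
    test_sets: Dict[str, List[Sample]],
    train_fn: Callable[[List[Sample]], Classifier],
) -> Dict[str, Dict[str, float]]:
    """Train on each source model's data, evaluate on every target model.

    Returns matrix[source][target] = accuracy of the source-trained
    detector on the target's test set, with no further training.
    """
    matrix: Dict[str, Dict[str, float]] = {}
    for source, train in train_sets.items():
        clf = train_fn(train)
        matrix[source] = {
            target: accuracy(clf, test) for target, test in test_sets.items()
        }
    return matrix
```

Each dataset is assumed to mix human-written text with text from a single LLM. The diagonal of the resulting matrix gives in-distribution detection accuracy, and the off-diagonal entries give the cross-model transfer that the abstract discusses. Swapping `train_fn` for a real classifier leaves the evaluation loop unchanged.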

