This study explores linguistic differences between human and LLM-generated dialogues, using 19.5K dialogues generated by ChatGPT-3.5 as a companion to the EmpathicDialogues dataset. The research employs Linguistic Inquiry and Word Count (LIWC) analysis, comparing ChatGPT-generated conversations with human conversations across 118 linguistic categories. Results show greater variability and authenticity in human dialogues, but ChatGPT excels in categories such as social processes, analytical style, cognition, attentional focus, and positive emotional tone, reinforcing recent findings that LLMs are "more human than human." However, no significant difference was found in positive or negative affect between ChatGPT and human dialogues. Classifier analysis of dialogue embeddings indicates that the valence of affect is implicitly encoded even though affect is never explicitly mentioned in the conversations. The research also contributes a novel companion dataset of ChatGPT-generated conversations between two independent chatbots, designed to replicate an open-access corpus of human conversations that is widely used in AI research on language modeling. Our findings improve understanding of ChatGPT's linguistic capabilities and inform ongoing efforts to distinguish between human and LLM-generated text, which is critical for detecting AI-generated fakes, misinformation, and disinformation.