This study explores linguistic differences between human and LLM-generated dialogues, using 19.5K dialogues generated by ChatGPT-3.5 as a companion to the EmpathicDialogues dataset. The research employs Linguistic Inquiry and Word Count (LIWC) analysis to compare ChatGPT-generated conversations with human conversations across 118 linguistic categories. Results show greater variability and authenticity in human dialogues, whereas ChatGPT scores higher in categories such as social processes, analytical style, cognition, attentional focus, and positive emotional tone, reinforcing recent findings that LLMs are "more human than human." However, no significant difference was found in positive or negative affect between ChatGPT and human dialogues. Classifier analysis of dialogue embeddings indicates that the valence of affect is implicitly encoded even though affect is not explicitly mentioned in the conversations. The research also contributes a novel companion dataset of ChatGPT-generated conversations between two independent chatbots, designed to replicate an open-access corpus of human conversations that is widely used in AI research on language modeling. Our findings increase understanding of ChatGPT's linguistic capabilities and inform ongoing efforts to distinguish between human and LLM-generated text, which is critical in detecting AI-generated fakes, misinformation, and disinformation.