Rapid advances in large language models (LLMs) have significantly improved their ability to generate natural language, making LLM-generated texts increasingly indistinguishable from human-written ones. Recent research has predominantly focused on using LLMs to classify text as either human-written or machine-generated. In our study, we adopt a different approach by profiling texts spanning four domains based on 250 distinct linguistic features. We select the M4 dataset from Subtask B of SemEval 2024 Task 8. We automatically calculate various linguistic features with the LFTK tool and additionally measure the average syntactic depth, semantic similarity, and emotional content of each document. We then apply a two-dimensional PCA reduction to all the calculated features. Our analyses reveal significant differences between human-written texts and those generated by LLMs, particularly in the variability of these features, which we find to be considerably higher in human-written texts. This discrepancy is especially evident in text genres with less rigid linguistic style constraints. Our findings indicate that human-written texts are less cognitively demanding and have higher semantic content and richer emotional content than texts generated by LLMs. These insights underscore the need to incorporate meaningful linguistic features to better understand the textual outputs of LLMs.
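The two-dimensional PCA reduction and the variability comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature matrix is synthetic, LFTK is not called, and every name here (`make_docs`, `pca_2d`, `spread`) is hypothetical. PCA is computed with plain power iteration and deflation so that the example needs only the Python standard library; in practice a linear-algebra library would likely be used instead.

```python
import math
import random
import statistics


def standardize(rows):
    """Z-score each feature column; a constant column is left at zero."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def covariance(rows):
    """Covariance matrix of rows whose columns are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(matrix, iters=500, seed=0):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * matrix[i][j] * v[j]
                     for i in range(d) for j in range(d))
    return v, eigenvalue


def pca_2d(rows):
    """Project standardized rows onto their first two principal components."""
    z = standardize(rows)
    cov = covariance(z)
    v1, lam1 = top_component(cov)
    # Deflation: subtract the first component so the next run finds the second.
    d = len(cov)
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(d)]
                for i in range(d)]
    v2, _ = top_component(deflated)
    return [(sum(a * b for a, b in zip(r, v1)),
             sum(a * b for a, b in zip(r, v2))) for r in z]


def spread(points):
    """Total variance of 2-D points: sum of the per-axis variances."""
    xs, ys = zip(*points)
    return statistics.pvariance(xs) + statistics.pvariance(ys)


def make_docs(rng, n, scale):
    """Synthetic stand-in for per-document feature vectors (assumption)."""
    docs = []
    for _ in range(n):
        a, b = rng.gauss(0.0, scale), rng.gauss(0.0, 0.5 * scale)
        docs.append([x + rng.gauss(0.0, 0.1) for x in (a, a, a + b, b, b)])
    return docs


rng = random.Random(0)
human = make_docs(rng, 100, scale=2.0)    # assumed: more variable features
machine = make_docs(rng, 100, scale=0.5)  # assumed: less variable features

projected = pca_2d(human + machine)
human_spread = spread(projected[:100])
machine_spread = spread(projected[100:])
print(f"human spread:   {human_spread:.3f}")
print(f"machine spread: {machine_spread:.3f}")
```

Both groups are standardized and projected together so that they share the same axes, which makes the per-group spread in the reduced space directly comparable. The larger spread for the "human" group here follows from how the synthetic data was built and says nothing about the paper's actual results.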