Recent advances in large language models (LLMs) have the potential to shed light on the debate over the extent to which knowledge representation requires grounding in embodied experience. Despite learning from limited modalities (e.g., text for GPT-3.5, and text+image for GPT-4), LLMs have nevertheless demonstrated human-like behaviors in various psychology tasks, which may provide an alternative account of how conceptual knowledge is acquired. We compared lexical conceptual representations between humans and ChatGPT (GPT-3.5 and GPT-4) using subjective ratings of various lexical conceptual features or dimensions (e.g., emotional arousal, concreteness, haptic). The results show that both GPT-3.5 and GPT-4 were strongly correlated with humans in some abstract dimensions, such as emotion and salience. In dimensions related to sensory and motor domains, GPT-3.5 showed weaker correlations, while GPT-4 improved significantly over GPT-3.5. Still, GPT-4 struggled to fully capture motor aspects of conceptual knowledge, such as actions with the foot/leg, mouth/throat, and torso. Moreover, we found that GPT-4's progress can largely be associated with its training in the visual domain. Certain aspects of conceptual representation appear to exhibit a degree of independence from sensory capacities, but others seem to necessitate them. Our findings provide insights into the complexities of knowledge representation from diverse perspectives and highlight the potential influence of embodied experience in shaping language and cognition.
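The comparison described in the abstract, correlating human and model ratings separately for each dimension, can be illustrated with a minimal sketch. It assumes a Spearman rank correlation; the abstract does not state which coefficient was used. The dimension names, the rating values, and the helper names (`ranks`, `pearson`, `spearman`, `per_dimension_correlation`) are hypothetical and are not taken from the study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def per_dimension_correlation(human, model):
    """Correlate human and model ratings word by word within each dimension."""
    return {dim: spearman(human[dim], model[dim]) for dim in human}


# Hypothetical toy ratings for five words per dimension (NOT data from the study).
human = {
    "arousal": [1.0, 2.0, 3.0, 4.0, 5.0],
    "foot_leg": [1.0, 2.0, 3.0, 4.0, 5.0],
}
model = {
    "arousal": [1.5, 2.5, 3.0, 4.5, 4.8],
    "foot_leg": [3.0, 1.0, 4.0, 2.0, 5.0],
}

correlations = per_dimension_correlation(human, model)
for dim, rho in correlations.items():
    print(f"{dim}: rho = {rho:.2f}")
```

In this toy setup the model orders the words exactly as the humans do on the hypothetical "arousal" dimension and only partly on "foot_leg", which mirrors the kind of per-dimension contrast the abstract reports without reproducing any of its actual results.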