Neural network models of language have long been used as a tool for developing hypotheses about conceptual representation in the mind and brain. For many years, such use involved extracting vector-space representations of words and using the distances among them to predict or understand human behavior in various semantic tasks. Contemporary large language models (LLMs), however, make it possible to interrogate the latent structure of conceptual representations using experimental methods nearly identical to those commonly used with human participants. The current work uses three common techniques borrowed from cognitive psychology to estimate and compare the structure of concepts in humans and a suite of LLMs. In humans, we show that conceptual structure is robust to differences in culture, language, and method of estimation. Structures estimated from LLM behavior, while individually fairly consistent with those estimated from human behavior, vary much more depending on the particular task used to generate responses: across tasks, estimates of conceptual structure from the very same model cohere less with one another than do human structure estimates. These results highlight an important difference between contemporary LLMs and human cognition, with implications for understanding some fundamental limitations of contemporary machine language.
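As a rough illustration of what it means for two structure estimates to "cohere", the sketch below compares pairwise distance matrices over the same set of concepts. It is not the procedure used in this work: the choice of cosine distance, the use of Pearson correlation over the upper triangle, the function names `cosine_distance_matrix` and `structure_agreement`, and the random stand-in data are all assumptions made for the example.

```python
import numpy as np


def cosine_distance_matrix(embeddings):
    """Pairwise cosine distances between the rows of `embeddings`."""
    X = np.asarray(embeddings, dtype=float)
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T


def structure_agreement(dist_a, dist_b):
    """Pearson correlation between the upper triangles of two distance matrices.

    Both matrices must describe the same concepts in the same order.
    Higher values mean the two estimates imply more similar structure.
    """
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    if dist_a.shape != dist_b.shape:
        raise ValueError("distance matrices must have the same shape")
    upper = np.triu_indices_from(dist_a, k=1)  # skip diagonal and duplicate pairs
    return float(np.corrcoef(dist_a[upper], dist_b[upper])[0, 1])


# Hypothetical stand-in data: 6 concepts embedded in 4 dimensions.
rng = np.random.default_rng(0)
task_a = rng.normal(size=(6, 4))

# A rotated copy of task_a: different coordinates, identical structure.
rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))
task_b = task_a @ rotation

# An unrelated estimate of the same 6 concepts.
task_c = rng.normal(size=(6, 4))

D_a = cosine_distance_matrix(task_a)
D_b = cosine_distance_matrix(task_b)
D_c = cosine_distance_matrix(task_c)

print("agreement, same structure:     ", structure_agreement(D_a, D_b))
print("agreement, unrelated structure:", structure_agreement(D_a, D_c))
```

Under these assumptions, a model whose estimates agree well with human data task by task could still show low agreement between its own tasks, which is the kind of pattern the abstract describes.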