Neural network models of language have long been used as a tool for developing hypotheses about conceptual representation in the mind and brain. For many years, such use involved extracting vector-space representations of words and using the distances among them to predict or understand human behavior in various semantic tasks. In contemporary language AIs, however, it is possible to interrogate the latent structure of conceptual representations using methods nearly identical to those commonly used with human participants. The current work uses two common techniques borrowed from cognitive psychology to estimate and compare lexical-semantic structure in both humans and a well-known AI, the DaVinci variant of GPT-3. In humans, we show that conceptual structure is robust to differences in culture, language, and method of estimation. Structures estimated from AI behavior, while individually fairly consistent with those estimated from human behavior, depend much more on the particular task used to generate behavioral responses: responses generated by the very same model in the two tasks yield estimates of conceptual structure that cohere less with one another than do human structure estimates. The results suggest one important way that knowledge inhering in contemporary AIs can differ from human cognition.