Do large language models (LLMs) display rational reasoning? LLMs have been shown to contain human biases due to the data they have been trained on; whether this is reflected in rational reasoning remains less clear. In this paper, we answer this question by evaluating seven language models using tasks from the cognitive psychology literature. We find that, like humans, LLMs display irrationality in these tasks. However, the way this irrationality is displayed does not reflect that shown by humans: when LLMs give incorrect answers to these tasks, the answers are often incorrect in ways that differ from human-like biases. On top of this, the LLMs reveal an additional layer of irrationality in the significant inconsistency of their responses. Aside from the experimental results, this paper seeks to make a methodological contribution by showing how we can assess and compare different capabilities of these types of models, in this case with respect to rational reasoning.