This paper investigates whether current large language models exhibit human-like biases in logical reasoning. Specifically, we focus on syllogistic reasoning, a well-studied form of inference in the cognitive science of human deduction. To facilitate our analysis, we introduce a dataset called NeuBAROCO, originally designed for psychological experiments that assess human logical abilities in syllogistic reasoning. The dataset consists of syllogistic inferences in both English and Japanese. We examine three types of biases observed in human syllogistic reasoning: belief biases, conversion errors, and atmosphere effects. Our findings show that current large language models struggle more with problems that involve these three types of biases.