A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises. Psychologists have documented several ways in which humans' inferences deviate from the rules of logic. Do language models, which are trained on text generated by humans, replicate these biases, or are they able to overcome them? Focusing on the case of syllogisms -- inferences from two simple premises, which have been studied extensively in psychology -- we show that larger models are more logical than smaller ones, and also more logical than humans. At the same time, even the largest models make systematic errors, some of which mirror human reasoning biases such as ordering effects and logical fallacies. Overall, we find that language models mimic the human biases included in their training data, but are able to overcome them in some cases.