A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises. Psychologists have documented several ways in which humans' inferences deviate from the rules of logic. Do language models, which are trained on text generated by humans, replicate such human biases, or are they able to overcome them? Focusing on the case of syllogisms -- inferences from two simple premises -- we show that, within the PaLM2 family of transformer language models, larger models are more logical than smaller ones, and also more logical than humans. At the same time, even the largest models make systematic errors, some of which mirror human reasoning biases: they show sensitivity to the (irrelevant) ordering of the variables in the syllogism, and draw confident but incorrect inferences from particular syllogisms (syllogistic fallacies). Overall, we find that language models often mimic the human biases included in their training data, but are able to overcome them in some cases.