The recent advent of large language models has reinvigorated debate over whether human cognitive capacities might emerge in such generic models given sufficient training data. Of particular interest is the ability of these models to reason about novel problems zero-shot, without any direct training. In human cognition, this capacity is closely tied to an ability to reason by analogy. Here, we performed a direct comparison between human reasoners and a large language model (the text-davinci-003 variant of GPT-3) on a range of analogical tasks, including a novel text-based matrix reasoning task closely modeled on Raven's Progressive Matrices. We found that GPT-3 displayed a surprisingly strong capacity for abstract pattern induction, matching or even surpassing human capabilities in most settings. Our results indicate that large language models such as GPT-3 have acquired an emergent ability to find zero-shot solutions to a broad range of analogy problems.