Reasoning in terms of relations, analogies, and abstraction is a hallmark of human intelligence. An active debate is whether this relies on the use of symbolic processing or can be achieved using the same forms of function approximation that have been used for tasks such as image, audio, and, most recently, language processing. We propose an intermediate approach, motivated by principles of cognitive neuroscience, in which abstract symbols can emerge from distributed, neural representations under the influence of an inductive bias for learning that we refer to as a ``relational bottleneck.'' We present a framework that casts this inductive bias in terms of an extension of Transformers, in which specific types of attention mechanisms enforce the relational bottleneck and transform distributed symbols to implement a form of relational reasoning and abstraction. We theoretically analyze the class of relation functions the models can compute and empirically demonstrate superior sample-efficiency on relational tasks compared to standard Transformer architectures.