An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from extraneous features about individual objects. This disentanglement supports abstraction and generalization from limited data. The Abstractor is first evaluated on simple discriminative relational tasks and compared to existing relational architectures. Next, the Abstractor is evaluated on purely relational sequence-to-sequence tasks, where dramatic improvements are seen in sample efficiency compared to standard Transformers. Finally, Abstractors are evaluated on a collection of tasks based on mathematical problem solving, where modest but consistent improvements in performance and sample efficiency are observed.