An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from object-level features. This enables explicit relational reasoning, supporting abstraction and generalization from limited data. Abstractors are first evaluated on simple discriminative relational tasks and compared to existing relational architectures. Next, Abstractors are evaluated on purely relational sequence-to-sequence tasks, where dramatic improvements in sample efficiency over standard Transformers are observed. Finally, Abstractors are evaluated on a collection of tasks based on mathematical problem solving, where consistent improvements in performance and sample efficiency are observed.
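For intuition, the disentangling mechanism can be sketched as follows: attention scores are computed from the input objects, so the attention matrix captures pairwise relations, while the values are input-independent learned symbols rather than the objects' features. The minimal single-head PyTorch sketch below illustrates this idea; the class name RelationalCrossAttention, the max_objects parameter, and the scaled-softmax normalization are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationalCrossAttention(nn.Module):
    """Single-head sketch of relational cross-attention.

    Queries and keys come from the input objects, so the attention
    matrix encodes pairwise relations; the values are learned,
    input-independent symbols, so the output carries relational
    information disentangled from object-level features.
    """

    def __init__(self, d_model: int, max_objects: int):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        # Learned symbols: one per position, independent of the input.
        self.symbols = nn.Parameter(torch.randn(max_objects, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_objects, d_model)
        n, d = x.size(1), x.size(-1)
        q, k = self.query(x), self.key(x)
        # Relation matrix: scaled inner products between object encodings.
        scores = q @ k.transpose(-2, -1) / (d ** 0.5)
        attn = F.softmax(scores, dim=-1)
        # Values are the learned symbols, not the objects themselves.
        return attn @ self.symbols[:n]
```

In contrast, standard self-attention would return attn @ value(x), mixing object-level features back into the output; replacing the values with symbols is what gives the relational bottleneck described above.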