Compositional relational reasoning (CRR) is a hallmark of human intelligence, but we lack a clear understanding of whether and how existing transformer-based large language models (LLMs) can solve CRR tasks. To enable systematic exploration of the CRR capability of LLMs, we first propose a new synthetic benchmark called Generalized Associative Recall (GAR), which integrates and generalizes the essence of several tasks from mechanistic interpretability (MI) studies into a unified framework. Evaluation shows that GAR is challenging enough for existing LLMs, revealing a fundamental deficiency in their CRR ability, while remaining simple enough for systematic MI study. Then, to understand how LLMs solve GAR tasks, we use attribution patching to discover the core circuits that Vicuna-33B reuses across different tasks, along with a set of vital attention heads. Intervention experiments show that the correct functioning of these heads significantly impacts task performance. In particular, we identify two classes of heads whose activations represent the abstract notions of true and false in GAR tasks, respectively. These heads play a fundamental role in CRR across various models and tasks. The dataset and code are available at https://github.com/Caiyun-AI/GAR.