We introduce Attention Graphs, a new tool for mechanistic interpretability of Graph Neural Networks (GNNs) and Graph Transformers, based on the mathematical equivalence between message passing in GNNs and the self-attention mechanism in Transformers. Attention Graphs aggregate attention matrices across Transformer layers and heads to describe how information flows among input nodes. Through experiments on homophilous and heterophilous node classification tasks, we analyze Attention Graphs from a network science perspective and find that: (1) when Graph Transformers are allowed to learn the optimal graph structure via all-to-all attention among input nodes, the learned Attention Graphs tend not to correlate with the original input graph structure; and (2) for heterophilous graphs, different Graph Transformer variants can achieve similar performance while utilising distinct information flow patterns. Open-source code: https://github.com/batu-el/understanding-inductive-biases-of-gnns
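As a rough illustration of the aggregation step described above, the sketch below combines per-layer, per-head attention matrices into a single node-to-node information-flow matrix. Averaging over heads and composing layers by matrix product are assumptions made here for illustration; the paper's exact aggregation scheme may differ.

```python
import numpy as np

def attention_graph(attn):
    """Collapse attention tensors of shape [layers, heads, N, N] into an
    N x N Attention Graph. Assumption: average over heads, then compose
    the per-layer matrices by matrix multiplication so that entry (i, j)
    reflects multi-hop information flow from node j into node i."""
    layers, heads, n, _ = attn.shape
    per_layer = attn.mean(axis=1)   # average heads -> [layers, N, N]
    flow = np.eye(n)
    for layer_attn in per_layer:    # compose information flow layer by layer
        flow = layer_attn @ flow
    return flow

# Toy example: 3 layers, 4 heads, 5 nodes, with row-stochastic attention.
rng = np.random.default_rng(0)
raw = rng.random((3, 4, 5, 5))
attn = raw / raw.sum(axis=-1, keepdims=True)

G = attention_graph(attn)
print(G.shape)                          # (5, 5)
print(np.allclose(G.sum(axis=1), 1.0))  # rows remain stochastic
```

Because each head's attention matrix is row-stochastic, both the head average and the layer-wise product preserve row-stochasticity, so the resulting Attention Graph can be read as a distribution over source nodes for each target node.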