Graph Neural Networks (GNNs) have become increasingly popular for modeling graph-structured data, and attention mechanisms have been pivotal in enabling these models to capture complex patterns. In our study, we reveal a critical yet underexplored consequence of integrating attention into edge-featured GNNs: the emergence of Massive Activations (MAs) within attention layers. By developing a novel method for detecting MAs on edge features, we show that these extreme activations are not mere numerical anomalies but encode domain-relevant signals. Our post-hoc interpretability analysis demonstrates that, in molecular graphs, MAs aggregate predominantly on common bond types (e.g., single and double bonds) while sparing more informative ones (e.g., triple bonds). Furthermore, our ablation studies confirm that MAs can serve as natural attribution indicators, reallocating to less informative edges. We evaluate several edge-featured attention-based GNN models on benchmark datasets, including ZINC, TOX21, and PROTEINS. Our key contributions are (1) establishing a direct link between attention mechanisms and the generation of MAs in edge-featured GNNs, and (2) developing a robust definition and detection method for MAs that enables reliable post-hoc interpretability. Overall, our study reveals the complex interplay among attention mechanisms, edge-featured GNN models, and the emergence of MAs, providing crucial insights for relating GNN internals to domain knowledge.
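To make the notion of a Massive Activation concrete, the sketch below shows one plausible magnitude-based detection criterion: an entry of an edge-feature activation tensor is flagged when its magnitude exceeds both an absolute floor and a large multiple of the tensor's median absolute activation. This is a minimal illustration under assumed thresholds (`floor`, `ratio` are hypothetical parameters), not the paper's actual definition.

```python
import numpy as np

def detect_massive_activations(acts, ratio=1000.0, floor=100.0):
    """Flag massive activations in an edge-feature activation tensor.

    acts: array of shape (num_edges, hidden_dim), activations from an
          attention layer operating on edge features.
    An entry is flagged when its magnitude is above an absolute floor
    AND a large multiple of the median absolute activation.
    Returns a boolean mask of the same shape as `acts`.
    """
    mags = np.abs(acts)
    median = np.median(mags)
    return (mags > floor) & (mags > ratio * median)

# Example: a mostly small tensor with one extreme spike.
acts = np.full((4, 8), 0.5)
acts[2, 3] = 600.0
mask = detect_massive_activations(acts)  # only the spike is flagged
```

Per-edge aggregation of such a mask (e.g., counting flagged entries per edge) is then what allows MAs to be compared across bond types, as in the molecular-graph analysis above.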