CAT: A Causally Graph Attention Network for Trimming Heterophilic Graph

The Local Attention-guided Message Passing Mechanism (LAMP) adopted in Graph Attention Networks (GATs) is designed to adaptively learn the importance of neighboring nodes for better local aggregation on the graph. It effectively brings the representations of similar neighbors closer together and thus gives GATs stronger discrimination ability. However, the discrimination ability of existing GATs declines significantly on heterophilic graphs: the high proportion of dissimilar neighbors can weaken the self-attention of the central node, which pulls the central node away from similar nodes in the representation space. We call this effect of neighboring nodes the Distraction Effect (DE). To estimate and weaken the DE of neighboring nodes, we propose a Causally graph Attention network for Trimming heterophilic graph (CAT). To estimate the DE, we note that it is generated through two paths (grabbing the attention assigned to neighbors and reducing the self-attention of the central node), so we model it with the Total Effect, a causal estimand that can be estimated from intervened data. To weaken the DE, we identify the neighbors with the highest DE, which we call Distraction Neighbors, and remove them. We adopt three representative GATs as base models within the proposed CAT framework and conduct experiments on seven heterophilic datasets of three different sizes. Comparative experiments show that CAT improves the node classification accuracy of all base GAT models. Ablation experiments and visualization further validate the improvement in discrimination ability brought by CAT. The source code is available at https://github.com/GeoX-Lab/CAT.
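
The estimate-then-trim idea can be illustrated with a toy sketch. It is not the authors' implementation (see the linked repository for that), and every name in it is hypothetical. It assumes a single central node whose attention is a softmax over raw scores for itself and its neighbors, and it takes the central node's self-attention weight as the outcome of interest, which is an assumption made here for illustration only. The effect of a neighbor is then approximated by intervening on the graph (removing that neighbor) and measuring how much the outcome changes.

```python
import math

def self_attention(self_score, neighbor_scores):
    """Softmax weight the central node assigns to itself."""
    scores = [self_score] + list(neighbor_scores.values())
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    return exps[0] / sum(exps)

def distraction_effects(self_score, neighbor_scores):
    """Toy per-neighbor effect: gain in self-attention when that neighbor is removed."""
    observed = self_attention(self_score, neighbor_scores)
    effects = {}
    for name in neighbor_scores:
        # Intervention: the same neighborhood with this one neighbor cut off.
        intervened = {k: v for k, v in neighbor_scores.items() if k != name}
        effects[name] = self_attention(self_score, intervened) - observed
    return effects

def trim(self_score, neighbor_scores, k=1):
    """Drop the k neighbors with the largest toy effect; return the kept names."""
    effects = distraction_effects(self_score, neighbor_scores)
    dropped = set(sorted(effects, key=effects.get, reverse=True)[:k])
    return [name for name in neighbor_scores if name not in dropped]

# Hypothetical raw attention scores for three neighbors of one central node.
raw_scores = {"a": 2.0, "b": 0.0, "c": 1.0}
effects = distraction_effects(1.0, raw_scores)
kept = trim(1.0, raw_scores, k=1)
print(effects)
print(kept)  # ['b', 'c']
```

In this toy setting the neighbor with the largest raw score always has the largest effect, because the softmax is the only mechanism involved. The sketch therefore shows the intervene-and-compare pattern rather than the full method, which estimates the effect within a trained GAT.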
