The Local Attention-guided Message Passing Mechanism (LAMP) adopted in Graph Attention Networks (GATs) adaptively learns the importance of neighboring nodes for better local aggregation on the graph. It effectively brings the representations of similar neighbors closer together and thus yields stronger discrimination ability. However, existing GATs suffer a significant decline in discrimination ability on heterophilic graphs, because a high proportion of dissimilar neighbors weakens the self-attention of the central node; jointly, these effects push the central node away from similar nodes in the representation space. We call this effect of neighboring nodes the Distraction Effect (DE). To estimate and weaken the DE of neighboring nodes, we propose a Causally graph Attention network for Trimming heterophilic graph (CAT). To estimate the DE, because it is generated through two paths (grabbing the attention assigned to neighbors and reducing the self-attention of the central node), we model it with the Total Effect, a causal estimand that can be estimated from intervened data. To weaken the DE, we identify the neighbors with the highest DE, which we call Distraction Neighbors, and remove them. We adopt three representative GATs as base models within the proposed CAT framework and conduct experiments on seven heterophilic datasets of three different sizes. Comparative experiments show that CAT improves the node classification accuracy of all base GAT models. Ablation experiments and visualization further validate the enhancement in discrimination ability brought by CAT. The source code is available at https://github.com/GeoX-Lab/CAT.
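The estimate-then-trim idea described above can be illustrated with a minimal sketch. This is not the released CAT implementation. It assumes a single attention head with dot-product scores, and it uses a hypothetical outcome function (cosine similarity between the aggregated representation and a reference vector) as a stand-in for whatever discrimination measure is used in practice. The function names `aggregate`, `distraction_effects`, and `trim_neighbors` are introduced here for illustration only. The Total Effect of a neighbor is approximated as the change in the outcome when that neighbor is removed by intervention.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def aggregate(h, center, neighbors):
    """Attention-weighted aggregation over the center node and its neighbors.

    Assumption: scores are plain dot products with the center's features,
    normalized by a softmax over {center} and its neighbors.
    """
    idx = [center] + list(neighbors)
    scores = h[idx] @ h[center]
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights = weights / weights.sum()
    return weights @ h[idx]


def distraction_effects(h, center, neighbors, outcome):
    """Per-neighbor effect: outcome after removing the neighbor minus factual outcome.

    A positive value means the outcome improves once the neighbor is removed,
    i.e. the neighbor was distracting under this (hypothetical) outcome.
    """
    factual = outcome(aggregate(h, center, neighbors))
    effects = {}
    for j in neighbors:
        kept = [n for n in neighbors if n != j]  # intervention: drop edge to j
        effects[j] = outcome(aggregate(h, center, kept)) - factual
    return effects


def trim_neighbors(h, center, neighbors, outcome, k=1):
    """Remove the k neighbors with the highest estimated effect."""
    effects = distraction_effects(h, center, neighbors, outcome)
    ranked = sorted(neighbors, key=lambda j: effects[j], reverse=True)
    removed = ranked[:k]
    kept = [n for n in neighbors if n not in removed]
    return kept, removed


# Toy graph: node 0 is the center; nodes 1 and 2 are similar, node 3 is not.
h = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
outcome = lambda z: cosine(z, h[0])  # hypothetical proxy for discrimination
kept, removed = trim_neighbors(h, 0, [1, 2, 3], outcome, k=1)
print("kept:", kept, "removed:", removed)
```

In this toy example only the dissimilar node 3 has a positive estimated effect, so it is the neighbor that gets trimmed. In a real setting the outcome and the attention scores would come from the trained base GAT rather than from raw features.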