Graph-based anomaly detection is an important research topic in the field of graph neural networks (GNNs). We find that, in graph anomaly detection, the difference in homophily distribution between classes is significantly greater than in standard homophilic or heterophilic graphs. We introduce, for the first time, a metric called Class Homophily Variance that quantifies this phenomenon. To mitigate its impact, we propose a novel GNN model named Homophily Edge Generation Graph Neural Network (HedGe). Previous works typically prune, select, or add connections among the original relationships; we refer to these methods as modifications. In contrast, our method generates new relationships with low class homophily variance and uses the original relationships only as an auxiliary. HedGe samples homophilic adjacency matrices from scratch using a self-attention mechanism, which lets it leverage nodes that are related in the feature space but not directly connected in the original graph. Additionally, we modify the loss function to penalize the model for generating unnecessary heterophilic edges. Extensive comparison experiments demonstrate that HedGe achieves the best performance on multiple benchmark datasets covering anomaly detection and edgeless node classification. The proposed model is also more robust under the novel Heterophily Attack, which increases class homophily variance, on other graph classification tasks.
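The abstract does not give the formula for Class Homophily Variance, so the following is only a minimal sketch of one plausible reading: compute each node's homophily as the fraction of its neighbors that share its label, average that value within each class, and take the variance of the class means. The function names `node_homophily` and `class_homophily_variance`, the toy graph, and the choice of population variance are all assumptions made for illustration and may differ from the paper's definition.

```python
from collections import defaultdict
from statistics import pvariance


def node_homophily(edges, labels):
    """Fraction of each node's neighbors sharing its label (undirected graph)."""
    same = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        # Count the edge once from each endpoint's point of view.
        for a, b in ((u, v), (v, u)):
            degree[a] += 1
            same[a] += int(labels[a] == labels[b])
    return {n: same[n] / degree[n] for n in degree}


def class_homophily_variance(edges, labels):
    """Assumed definition: variance of the per-class mean node homophily."""
    per_class = defaultdict(list)
    for node, h in node_homophily(edges, labels).items():
        per_class[labels[node]].append(h)
    class_means = {c: sum(v) / len(v) for c, v in per_class.items()}
    return class_means, pvariance(list(class_means.values()))


# Hypothetical toy graph: four normal nodes (label 0) forming a clique,
# plus one anomaly (label 1) attached only to normal nodes.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (4, 0), (4, 1)]
labels = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1}

means, chv = class_homophily_variance(edges, labels)
print(means, chv)
```

In this toy graph the normal class is highly homophilic while the anomaly has no same-class neighbors, so the class means lie far apart and the variance is large. That is the kind of gap between classes the metric is meant to capture.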