Unsupervised graph anomaly detection aims to identify rare patterns that deviate from the majority in a graph without the aid of labels, which is important for a variety of real-world applications. Recent advances have used Graph Neural Networks (GNNs) to learn effective node representations by aggregating information from neighborhoods. This is motivated by the hypothesis that nodes in a graph tend to behave consistently with their neighborhoods. However, graph anomalies can disrupt this consistency in multiple ways. Most existing methods employ GNNs directly to learn representations and disregard the negative impact of graph anomalies on GNNs, which leads to sub-optimal node representations and anomaly detection performance. While a few recent approaches have redesigned GNNs for graph anomaly detection under semi-supervised label guidance, how to address the adverse effects of graph anomalies on GNNs in unsupervised scenarios, and thereby learn effective representations for anomaly detection, remains under-explored. To bridge this gap, we propose a simple yet effective framework for Guarding Graph Neural Networks for Unsupervised Graph Anomaly Detection (G3AD). Specifically, G3AD introduces two auxiliary networks along with correlation constraints to guard the GNNs from encoding inconsistent information. Furthermore, G3AD introduces an adaptive caching module to guard the GNNs from solely reconstructing the observed data, which contains anomalies. Extensive experiments demonstrate that G3AD can outperform seventeen state-of-the-art methods on both synthetic and real-world datasets.
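The point that neighborhood aggregation can carry anomalous information into otherwise normal nodes can be illustrated with a minimal sketch. It is not G3AD itself; it assumes a generic mean aggregator with self-loops, and the toy graph, the feature values, and the function name `mean_aggregate` are hypothetical choices made only for illustration.

```python
import numpy as np


def mean_aggregate(adj: np.ndarray, feats: np.ndarray) -> np.ndarray:
    """One round of mean neighborhood aggregation (generic GNN-style step).

    Hypothetical helper for illustration; not the aggregation used by G3AD.
    """
    a_hat = adj + np.eye(adj.shape[0])       # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)   # neighborhood size incl. self
    return (a_hat @ feats) / deg             # average over the neighborhood


# Toy undirected graph with 4 nodes. Nodes 0, 1, 2 form a triangle;
# node 3 (assumed to be an attribute anomaly) is attached to node 0.
adj = np.array(
    [
        [0, 1, 1, 1],
        [1, 0, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
    ],
    dtype=float,
)

# One scalar feature per node; node 3 deviates strongly from the rest.
feats = np.array([[1.0], [1.0], [1.0], [10.0]])

h = mean_aggregate(adj, feats)
print(h.ravel())  # [3.25 1.   1.   5.5 ]
```

Node 0 is normal, yet its aggregated value moves from 1.0 to 3.25 because it averages in the anomalous neighbor, while the anomaly's own value is pulled from 10.0 toward its normal neighbor (5.5). Both effects blur the distinction that a detector relies on, which is the kind of inconsistent encoding the abstract says the GNN needs to be guarded against.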