Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes in graph-structured data across domains such as medicine, social networks, and e-commerce. However, the diversity of anomalies and the dearth of labeled data make the task challenging. Existing methods, whether reconstruction-based or based on contrastive learning, are effective but often inefficient because of their complex objectives and elaborate modules. To improve the efficiency of GAD, we introduce a simple method termed PREprocessing and Matching (PREM for short). Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities. PREM comprises two modules, a pre-processing module and an ego-neighbor matching module. It eliminates the need for message-passing propagation during training and employs a simple contrastive loss, leading to considerable reductions in training time and memory usage. Moreover, in rigorous evaluations on five real-world datasets, our method demonstrates robustness and effectiveness. Notably, on the ACM dataset, PREM achieved a 5% improvement in AUC, a 9-fold increase in training speed, and sharply reduced memory usage compared to the most efficient baseline.
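The two-stage idea described in the abstract (aggregate neighbor information once in a pre-processing step, then score each node by how well it matches its neighborhood) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the aggregation rule, the encoder, or the exact loss, so the mean aggregation, the cosine-similarity score, and the names `preprocess`, `cosine`, and `anomaly_scores` are all assumptions made for illustration. The learned encoder and the contrastive training step are omitted entirely; the sketch works directly on raw features.

```python
import math

def preprocess(adj, feats):
    """One-off step (assumed form): average each node's neighbor features.

    Because this runs once before any training, no message passing
    is needed afterwards.
    """
    neigh = []
    for i, nbrs in enumerate(adj):
        dim = len(feats[i])
        if not nbrs:
            # Isolated node: fall back to a zero vector.
            neigh.append([0.0] * dim)
            continue
        neigh.append(
            [sum(feats[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        )
    return neigh

def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def anomaly_scores(feats, neigh):
    """Ego-neighbor matching (assumed form): low similarity means a high score."""
    return [1.0 - cosine(f, n) for f, n in zip(feats, neigh)]

# Toy graph as an adjacency list: nodes 0-3 have similar features,
# node 4 has very different features from its neighbors.
adj = [[1, 2, 4], [0, 3, 4], [0, 3], [1, 2], [0, 1]]
feats = [[1.0, 0.1], [0.9, 0.0], [1.0, 0.0], [0.8, 0.1], [0.0, 1.0]]

neigh = preprocess(adj, feats)          # computed once, up front
scores = anomaly_scores(feats, neigh)   # one score per node
most_anomalous = max(range(len(scores)), key=lambda i: scores[i])
print(most_anomalous, [round(s, 3) for s in scores])
```

In this toy graph, node 4 receives the highest score because its features disagree with the average of its neighbors. A trained variant would presumably replace the raw features with learned embeddings and fit them with a contrastive objective, but those details are not given in the abstract.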