Despite the long history of traditional Graph Anomaly Detection (GAD) algorithms and the recent popularity of Graph Neural Networks (GNNs), it remains unclear (1) how they perform under a standard, comprehensive setting, (2) whether GNNs can outperform traditional algorithms such as tree ensembles, and (3) how efficient they are on large-scale graphs. In response, we introduce GADBench, a benchmark tool dedicated to supervised anomalous node detection in static graphs. GADBench facilitates a detailed comparison of 29 distinct models on ten real-world GAD datasets, ranging from thousands to millions ($\sim$6M) of nodes. Our main finding is that tree ensembles with simple neighborhood aggregation can outperform the latest GNNs tailored for the GAD task. We shed light on the current progress of GAD, laying a robust groundwork for subsequent investigations in this domain. GADBench is open-sourced at https://github.com/squareRoot3/GADBench.
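To make the phrase "tree ensembles with simple neighborhood aggregation" more concrete, the following is a minimal sketch of one way such node features could be built before being passed to a tree ensemble (for example a random forest or gradient-boosted trees). The abstract does not specify the aggregation function or the number of hops, so mean aggregation, the zero vector for isolated nodes, and the name `aggregate_neighbors` are all assumptions made for illustration, not the GADBench implementation.

```python
from collections import defaultdict


def aggregate_neighbors(features, edges, hops=1):
    """Concatenate each node's raw features with mean-aggregated neighbor features.

    Hypothetical helper (not part of GADBench).
    features: list of per-node feature lists, all of the same length
    edges:    iterable of undirected (u, v) node-index pairs
    hops:     number of aggregation rounds; each round appends one feature block
    """
    n = len(features)
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    current = [list(row) for row in features]  # features propagated so far
    output = [list(row) for row in features]   # hop-0 block: the raw features

    for _ in range(hops):
        propagated = []
        for node in range(n):
            nbrs = sorted(neighbors[node])
            dim = len(current[node])
            if nbrs:
                # Mean aggregation over direct neighbors (an assumption).
                mean = [sum(current[j][d] for j in nbrs) / len(nbrs)
                        for d in range(dim)]
            else:
                # Isolated node: fall back to zeros (an assumption).
                mean = [0.0] * dim
            propagated.append(mean)
        for node in range(n):
            output[node].extend(propagated[node])
        current = propagated
    return output


if __name__ == "__main__":
    # Toy star graph: node 0 is connected to nodes 1 and 2.
    toy_features = [[1.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
    toy_edges = [(0, 1), (0, 2)]
    for row in aggregate_neighbors(toy_features, toy_edges, hops=1):
        print(row)
```

Each output row has the original features followed by one block per hop, so a supervised tree ensemble trained on these rows sees both a node's own attributes and a summary of its neighborhood without any learned message passing.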