Graph Anomaly Detection (GAD) has recently become a research hotspot owing to its practical and theoretical value. Because GAD is application-driven and anomalous samples are rare, enriching the variety of its datasets is fundamental work. This paper therefore presents DGraph, a real-world dynamic graph in the finance domain. DGraph overcomes many limitations of current GAD datasets. It contains about 3M nodes, 4M dynamic edges, and 1M ground-truth nodes. We provide a comprehensive observation of DGraph, revealing that anomalous and normal nodes generally differ in structure, neighbor distribution, and temporal dynamics. These observations also suggest that unlabeled nodes are essential for detecting fraudsters. Furthermore, we conduct extensive experiments on DGraph. Together, the observations and experiments demonstrate that DGraph can advance GAD research and enable in-depth exploration of anomalous nodes.