Interpretable Graph-Level Anomaly Detection via Contrast with Normal Prototypes

Graph-level anomaly detection (GLAD) aims to identify anomalous graphs that deviate significantly from the majority of graphs in a dataset. While deep GLAD methods have shown promising performance, their black-box nature limits their reliability and deployment in real-world applications. Some recent methods attempt to explain their anomaly detection results, but they either provide explanations without referencing normal graphs or rely on abstract latent vectors as prototypes rather than concrete graphs from the dataset. To address these limitations, we propose Prototype-based Graph-Level Anomaly Detection (ProtoGLAD), an interpretable unsupervised framework that explains each detected anomaly by explicitly contrasting it with its nearest normal prototype graph. ProtoGLAD employs a point-set kernel to iteratively discover multiple normal prototype graphs and their associated clusters from the dataset, and then identifies graphs that are distant from all discovered normal clusters as anomalies. Extensive experiments on multiple real-world datasets demonstrate that ProtoGLAD achieves anomaly detection performance competitive with state-of-the-art GLAD methods while providing more human-interpretable, prototype-based explanations.
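The abstract gives only a high-level description of the procedure, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes each graph has already been mapped to a fixed-length vector (for example by a graph kernel or a GNN readout), and it uses a Gaussian kernel averaged over a set as a stand-in for the point-set kernel. All function names, parameters, and thresholds here (`discover_prototypes`, `anomaly_scores`, `gamma`, `sim_threshold`, `min_cluster_size`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(X, Y, gamma=1.0):
    """Pairwise Gaussian kernel values between the rows of X and the rows of Y."""
    sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)


def discover_prototypes(X, gamma=1.0, sim_threshold=0.5,
                        min_cluster_size=3, max_prototypes=10):
    """Iteratively pick a prototype and its cluster from the unassigned points.

    Assumption: the similarity of a point to a set is the mean kernel value
    between the point and the members of that set.
    """
    K = gaussian_kernel(X, X, gamma)
    remaining = np.arange(len(X))
    prototypes, clusters = [], []
    while len(remaining) >= min_cluster_size and len(prototypes) < max_prototypes:
        # Similarity of each unassigned point to the set of unassigned points.
        set_sim = K[np.ix_(remaining, remaining)].mean(axis=1)
        seed = remaining[np.argmax(set_sim)]
        # The cluster is every unassigned point similar enough to the seed.
        members = remaining[K[seed, remaining] >= sim_threshold]
        if len(members) < min_cluster_size:
            break  # what is left is too sparse to count as a normal cluster
        prototypes.append(int(seed))
        clusters.append(members)
        remaining = np.setdiff1d(remaining, members)
    return prototypes, clusters


def anomaly_scores(X, prototypes, clusters, gamma=1.0):
    """Score each point by its dissimilarity to the most similar normal cluster.

    Returns the scores and, for each point, the index of the prototype of its
    most similar cluster, which can serve as the contrast case in an explanation.
    """
    if not prototypes:
        raise ValueError("no normal clusters were discovered")
    # Column j holds the mean kernel similarity of every point to cluster j.
    sim = np.stack(
        [gaussian_kernel(X, X[members], gamma).mean(axis=1) for members in clusters],
        axis=1,
    )
    nearest = sim.argmax(axis=1)
    return 1.0 - sim.max(axis=1), np.asarray(prototypes)[nearest]
```

Because each prototype in this sketch is an index into the original data, a flagged item can be shown side by side with a concrete normal example instead of a latent vector, which is the kind of contrast the abstract describes. How ProtoGLAD actually computes graph similarity, selects prototypes, and thresholds scores is not specified in the abstract.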

