In anomaly detection, the degree of irregularity is often summarized as a real-valued anomaly score. We address the problem of attributing such anomaly scores to input features in order to interpret the results of anomaly detection. In particular, we investigate the use of the Shapley value for attributing the anomaly scores of semi-supervised detection methods. We propose a characteristic function designed specifically for attributing anomaly scores. The idea is to approximate the absence of some features by locally minimizing the anomaly score with respect to the features that are to be treated as absent. We examine the applicability of the proposed characteristic function, and of other general approaches to interpreting anomaly scores, on multiple datasets and multiple anomaly detection methods. The results indicate the potential utility of the attribution methods, including the proposed one.
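The idea of the characteristic function can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy anomaly score (a squared distance from a fixed "normal" center), a plain finite-difference gradient descent as the local minimizer, and exact enumeration of feature subsets. The names `anomaly_score`, `minimize_over`, and `shapley_values` are hypothetical and chosen for illustration only. The value of a subset S of present features is taken to be the anomaly score after locally minimizing over the features outside S, starting from the observed input.

```python
from itertools import combinations
from math import factorial


def anomaly_score(x, center=(0.0, 0.0, 0.0)):
    """Toy score (an assumption): squared distance from a 'normal' center."""
    return sum((xi - c) ** 2 for xi, c in zip(x, center))


def minimize_over(score, x, free, steps=200, lr=0.1, eps=1e-5):
    """Locally minimize score over the coordinates in `free`, starting at x.

    Uses gradient descent with central finite differences; coordinates
    not listed in `free` are held at their observed values.
    """
    z = list(x)
    for _ in range(steps):
        grad = {}
        for j in free:
            up = z[:]
            up[j] += eps
            down = z[:]
            down[j] -= eps
            grad[j] = (score(up) - score(down)) / (2 * eps)
        for j in free:
            z[j] -= lr * grad[j]
    return score(z)


def shapley_values(score, x):
    """Exact Shapley values; v(S) minimizes the score over features absent from S."""
    n = len(x)
    features = range(n)
    cache = {}

    def v(present):
        key = frozenset(present)
        if key not in cache:
            absent = [j for j in features if j not in key]
            cache[key] = minimize_over(score, x, absent)
        return cache[key]

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                total += weight * (v(subset + (i,)) - v(subset))
        phi.append(total)
    return phi


if __name__ == "__main__":
    x = [3.0, 0.1, -0.2]  # the first feature deviates strongly from the center
    print(shapley_values(anomaly_score, x))
```

Because this toy score is separable across features, each attribution should come out close to that feature's own squared deviation, so the first feature receives almost all of the score. Exact enumeration visits every subset and grows exponentially with the number of features, so a practical version would need a sampling-based approximation and a minimizer suited to the detector at hand.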