Identifying the root causes of anomalies in causal processes is vital across disciplines. Once the root causes are identified, they can be isolated and the necessary measures taken to restore normal operation. Causal processes are often modelled as graphs, with entities as nodes and their paths or interconnections as edges. Existing work considers only the contribution of nodes in the generative process and therefore cannot attribute the outlier score to the edges of a mechanism when the anomaly occurs in the connections. In this paper, we consider both the individual edges and the node of each mechanism when identifying root causes. To this end, we introduce a noisy functional causal model. We then employ Bayesian learning and inference methods to infer the noises of the nodes and edges, and express a target outlier leaf as a function of these node and edge noises. Finally, we propose an efficient gradient-based attribution method for computing anomaly attribution scores, which scales linearly with the number of nodes and edges. Experiments on simulated datasets and two real-world scenario datasets show better anomaly attribution performance for the proposed method than for the baselines. Our method also scales to larger graphs with more nodes and edges.
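To make the idea of attributing an outlier leaf to node and edge noises concrete, the following is a minimal sketch and not the method of the paper. It assumes a hypothetical three-node chain X1 -> X2 -> X3 with made-up weights, treats each edge noise as an additive perturbation of the edge weight, takes the noise values as already known instead of inferring them with Bayesian methods, and scores each noise term by gradient times deviation from a baseline. Central finite differences stand in for automatic differentiation here, so this toy version does not have the linear scaling described above.

```python
# Hypothetical 3-node chain X1 -> X2 -> X3; weights and noise values are invented.
WEIGHTS = {("X1", "X2"): 2.0, ("X2", "X3"): 0.5}


def leaf_value(noise):
    """Leaf X3 written as a function of node noises (n_*) and edge noises (e_*)."""
    x1 = noise["n_X1"]
    x2 = (WEIGHTS[("X1", "X2")] + noise["e_X1_X2"]) * x1 + noise["n_X2"]
    x3 = (WEIGHTS[("X2", "X3")] + noise["e_X2_X3"]) * x2 + noise["n_X3"]
    return x3


def gradient_times_deviation(f, noise, baseline, eps=1e-6):
    """Score each noise term by (d f / d noise_k) * (noise_k - baseline_k).

    The gradient is approximated with central finite differences.
    """
    scores = {}
    for key in noise:
        up = dict(noise)
        up[key] += eps
        down = dict(noise)
        down[key] -= eps
        grad = (f(up) - f(down)) / (2 * eps)
        scores[key] = grad * (noise[key] - baseline[key])
    return scores


# Assumed "normal" noise values, and an anomalous sample whose only change
# is a perturbation on the edge X2 -> X3.
baseline = {"n_X1": 1.0, "n_X2": 0.0, "n_X3": 0.0, "e_X1_X2": 0.0, "e_X2_X3": 0.0}
observed = dict(baseline, e_X2_X3=3.0)

scores = gradient_times_deviation(leaf_value, observed, baseline)
top = max(scores, key=lambda k: abs(scores[k]))
print(top, scores)
```

In this toy case the leaf moves from 1.0 to 7.0, and the whole deviation of about 6.0 is assigned to the edge noise `e_X2_X3`, while every node noise receives a score of zero. A node-only attribution would have no term to which this deviation could be assigned.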