The goal of Root Cause Analysis (RCA) is to explain why an anomaly occurred by identifying where the fault originated. Several recent works model the anomalous event as resulting from a change in the causal mechanism at the root cause, i.e., as a soft intervention. RCA is then the task of identifying which causal mechanism changed. In real-world applications, one often has only a few samples, or even a single sample, from the post-intervention distribution. This is a severe limitation for most methods, which assume that this distribution is known or can be estimated. Even methods that avoid this assumption are statistically ill-posed, because they must probe regression models in regions of low probability density. In this paper, we propose simple, efficient methods to overcome both difficulties in the case where there is a single root cause and the causal graph is a polytree. When the causal graph is known, we give guarantees for a traversal algorithm that requires only marginal anomaly scores and does not depend on specifying an arbitrary anomaly score cut-off. When the causal graph is unknown, we show that the heuristic of identifying root causes as the variables with the highest marginal anomaly scores is causally justified. To this end, we prove that, in polytrees, anomalies with small scores are unlikely to cause anomalies with larger scores, and we give upper bounds on the likelihood of causal pathways with non-monotonic anomaly scores.
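The unknown-graph heuristic described above (rank variables by marginal anomaly score and report the top one) can be illustrated with a minimal sketch. The abstract does not specify how the scores are computed, so everything here is an assumption made for illustration: the score is taken to be the negative log of an empirical tail probability of the absolute standardized deviation, estimated from reference samples of the normal regime, and the function names `marginal_anomaly_scores` and `top_scoring_variable` are hypothetical.

```python
import numpy as np


def marginal_anomaly_scores(reference, anomaly):
    """Score each variable of a single anomalous sample on its own.

    reference: (n_samples, n_vars) observations from the normal regime.
    anomaly:   (n_vars,) the single anomalous observation.
    Assumed score (not taken from the paper): -log of the empirical
    probability that a reference sample deviates at least as much.
    """
    reference = np.asarray(reference, dtype=float)
    anomaly = np.asarray(anomaly, dtype=float)

    mean = reference.mean(axis=0)
    std = reference.std(axis=0) + 1e-12  # guard against zero variance

    # Absolute standardized deviation, per variable.
    ref_dev = np.abs(reference - mean) / std
    obs_dev = np.abs(anomaly - mean) / std

    # Empirical tail probability with add-one smoothing so it is never zero.
    n = reference.shape[0]
    tail_counts = (ref_dev >= obs_dev).sum(axis=0)
    tail_prob = (tail_counts + 1) / (n + 1)

    return -np.log(tail_prob)


def top_scoring_variable(reference, anomaly, names):
    """Return the variable with the highest marginal score, plus all scores."""
    scores = marginal_anomaly_scores(reference, anomaly)
    return names[int(np.argmax(scores))], scores
```

Only marginal distributions are used, so the sketch needs neither the causal graph nor any regression model. It does not implement the traversal algorithm for the known-graph case, and it carries none of the guarantees stated in the abstract.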