Causal phenomena associated with rare events arise across a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current causal discovery methods often fail to uncover causal links between random variables in a dynamic setting when those links manifest only once the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test for data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system state before rare events that happen at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method and validate its performance on simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.
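To make the data reorganization concrete, the following is a minimal sketch of one way to superimpose pre-event windows and then run a conditional independence test on the pooled samples. It is an illustration under stated assumptions, not the method from the released code: the function names `superimpose_pre_event_windows` and `partial_corr_test`, the fixed window length `lag`, and the boolean event flags are all hypothetical, and the test shown is a standard linear partial-correlation test with a Fisher z-transform, swapped in as a stand-in for the paper's own test.

```python
import math
import numpy as np


def superimpose_pre_event_windows(trajectories, event_flags, lag):
    """Stack the `lag` states that precede the first rare event of each trajectory.

    trajectories: list of arrays, each of shape (T_i, d)
    event_flags:  list of boolean arrays, each of shape (T_i,)
    Returns an array of shape (n_windows, lag, d). Pooling windows that end
    at different timesteps relies on the assumed time-invariance of the system.
    """
    windows = []
    for states, flags in zip(trajectories, event_flags):
        hits = np.flatnonzero(flags)
        if hits.size == 0:
            continue  # no rare event in this trajectory
        t = hits[0]  # first occurrence only
        if t < lag:
            continue  # not enough history before the event
        windows.append(states[t - lag:t])
    if not windows:
        return np.empty((0, lag, 0))
    return np.stack(windows)


def partial_corr_test(x, y, z):
    """Stand-in test of 'x independent of y given z' (linear, Fisher z).

    x, y: arrays of shape (n,); z: array of shape (n,) or (n, k).
    Returns (partial correlation, two-sided p-value).
    """
    n = x.shape[0]
    z = z.reshape(n, -1)
    design = np.column_stack([np.ones(n), z])
    # Residuals of x and y after linearly regressing out z.
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))
    r = min(max(r, -0.999999), 0.999999)  # keep arctanh finite
    stat = math.sqrt(n - z.shape[1] - 3) * math.atanh(r)
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return r, p_value
```

In use, each pooled window would supply one sample: for instance, a state coordinate at the last pre-event step as `x`, another coordinate as `y`, and earlier steps in the window as the conditioning set `z`. That assignment is itself an assumption of this sketch.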