Causal phenomena associated with rare events occur across a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current methods for causal discovery often fail to uncover causal links between random variables in a dynamic setting when those links manifest only once the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test on data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system states preceding rare events that occur at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method, and validate its performance across various simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.
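To make the data reorganization concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes each trajectory is a list of `(x, y)` pairs, that a rare event is flagged by a user-supplied predicate on `y`, and that time-invariance justifies pooling every pre-event step across trajectories. The names `superimpose`, `permutation_pvalue`, and `is_rare` are hypothetical, and the difference-of-means permutation test is a simple stand-in for the conditional independence test described above.

```python
import random


def superimpose(trajectories, is_rare):
    """Pool (previous state, rare-event indicator) pairs across trajectories.

    Assumption: each trajectory is a list of (x, y) pairs, and only the steps
    up to and including the first rare event are kept.
    """
    pooled = []
    for traj in trajectories:
        for t in range(1, len(traj)):
            x_prev = traj[t - 1][0]
            rare = bool(is_rare(traj[t][1]))
            pooled.append((x_prev, rare))
            if rare:
                break  # discard everything after the first rare event
    return pooled


def permutation_pvalue(pooled, n_perm=2000, seed=0):
    """Permutation p-value for dependence between x_prev and the indicator.

    Stand-in statistic (an assumption): absolute difference between the mean
    of x_prev before rare steps and before non-rare steps.
    """
    rng = random.Random(seed)
    xs = [x for x, _ in pooled]
    labels = [r for _, r in pooled]
    n_rare = sum(labels)
    if n_rare == 0 or n_rare == len(labels):
        raise ValueError("need both rare and non-rare samples")
    total = sum(xs)

    def stat(lbls):
        rare_sum = sum(x for x, flag in zip(xs, lbls) if flag)
        return abs(rare_sum / n_rare - (total - rare_sum) / (len(xs) - n_rare))

    observed = stat(labels)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break any link between x_prev and the event
        if stat(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy usage: the rare event (y >= 5) first occurs at a different timestep
# in each trajectory, and the pooled samples are aligned on that event.
toy = [
    [(0.1, 0), (0.9, 0), (0.2, 5), (0.3, 0)],
    [(0.8, 0), (0.4, 6)],
]
pooled = superimpose(toy, lambda y: y >= 5)
```

A small p-value from `permutation_pvalue(pooled)` would suggest that the state just before a step carries information about whether the rare event occurs at that step. This sketch offers none of the non-asymptotic guarantees stated for the actual test.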