Causal phenomena associated with rare events occur across a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current causal discovery methods often fail to uncover causal links between random variables in a dynamic setting when those links manifest only when the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test for data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system states preceding rare events that occur at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method and validate its performance on simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.
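The idea of superimposing data around rare events and then testing conditional independence can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the toy system in `simulate`, the threshold of 2.0, the function names, and the choice to keep a one-step window around the first rare event. The test used here is a standard partial-correlation Fisher z-test, swapped in as a simple linear-Gaussian stand-in for the conditional independence test the abstract refers to.

```python
import math
import random


def simulate(n_steps, rng, threshold=2.0):
    """Hypothetical time-invariant toy system: x drives y only after x is rare."""
    x, y, traj = 0.0, 0.0, []
    for _ in range(n_steps):
        boost = 3.0 * x if x > threshold else 0.0  # link active only after a rare x
        x, y = 0.5 * x + rng.gauss(0, 1), 0.5 * y + rng.gauss(0, 1) + boost
        traj.append((x, y))
    return traj


def first_rare_index(traj, is_rare):
    """Return the first timestep at which is_rare(state) holds, or None."""
    for t, state in enumerate(traj):
        if is_rare(state):
            return t
    return None


def superimpose(trajectories, is_rare):
    """Align trajectories at their first rare event, whatever its timestep.

    Returns one row (state before, state at, state after the event) per
    trajectory in which the event has both a predecessor and a successor.
    """
    rows = []
    for traj in trajectories:
        t = first_rare_index(traj, is_rare)
        if t is not None and 0 < t < len(traj) - 1:
            rows.append((traj[t - 1], traj[t], traj[t + 1]))
    return rows


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr_test(x, y, z):
    """Fisher z-test of X independent of Y given a single variable Z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    stat = math.sqrt(len(x) - 4) * math.atanh(r)  # n - |Z| - 3 with |Z| = 1
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal p-value
    return r, p_value


rng = random.Random(0)
trajectories = [simulate(200, rng) for _ in range(500)]
rows = superimpose(trajectories, lambda s: s[0] > 2.0)

# Test whether x at the rare event is independent of y one step later,
# given y at the event. The pre-event state is kept in each row but unused here.
x_event = [at[0] for _, at, _ in rows]
y_event = [at[1] for _, at, _ in rows]
y_next = [after[1] for _, _, after in rows]
r, p = partial_corr_test(x_event, y_next, y_event)
print(f"{len(rows)} aligned samples, partial correlation {r:.2f}, p-value {p:.3g}")
```

Because the toy dynamics do not depend on time, windows taken at different timesteps can be pooled into one sample, which is what makes a test feasible even though each trajectory contributes only one first rare event.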