Causal discovery from time series data is a common problem across the sciences. Often, multiple datasets of the same system variables are available, for instance, time series of river runoff from different catchments. The local catchment systems then share certain causal parents, such as time-dependent large-scale weather affecting all catchments, but differ in other, catchment-specific drivers, such as the altitude of the catchment. These drivers can be called temporal and spatial contexts, respectively, and are often partially unobserved. Pooling the datasets and considering the joint causal graph among system, context, and certain auxiliary variables makes it possible to overcome the latent confounding of system variables caused by unobserved contexts. In this work, we present a non-parametric time series causal discovery method, J(oint)-PCMCI+, that efficiently learns such joint causal time series graphs, including time lags, when both observed and latent contexts are present. We present asymptotic consistency results and numerical experiments demonstrating the utility and limitations of the method.