We consider the problem of detecting causal relationships between discrete time series, in the presence of potential confounders. A hypothesis test is introduced for identifying the temporally causal influence of $(x_n)$ on $(y_n)$, causally conditioned on a possibly confounding third time series $(z_n)$. Under natural Markovian modeling assumptions, it is shown that the null hypothesis, corresponding to the absence of temporally causal influence, is equivalent to the underlying `causal conditional directed information rate' being equal to zero. The plug-in estimator for this functional is identified with the log-likelihood ratio test statistic for the desired test. This statistic is shown to be asymptotically normal under the alternative hypothesis and asymptotically $\chi^2$ distributed under the null, facilitating the computation of $p$-values when used on empirical data. The effectiveness of the resulting hypothesis test is illustrated on simulated data, validating the underlying theory. The test is also employed in the analysis of spike train data recorded from neurons in the V4 and FEF brain regions of behaving animals during a visual attention task. There, the test results are seen to identify interesting and biologically relevant information.
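To make the structure of the test concrete, here is a minimal sketch in Python, not the authors' implementation. It assumes a simplified setting that the abstract does not specify: memory length one, and conditioning on the previous sample of $(z_n)$ only. The function names (`plugin_cmi`, `chi2_sf_even`, `causal_test`) are hypothetical. The degrees of freedom, $m_y m_z (m_x - 1)(m_y - 1)$ for alphabet sizes $m_x, m_y, m_z$, come from the usual parameter count for nested Markov models with all transitions having positive probability; this count is an assumption here and should be checked against the paper's theorem. Because $m_y(m_y - 1)$ is always even, the $\chi^2$ tail has a closed form and no external library is needed.

```python
import math
import random
from collections import Counter


def plugin_cmi(x, y, z):
    """Plug-in estimate (nats) of I(Y_n ; X_{n-1} | Y_{n-1}, Z_{n-1})."""
    n = len(y) - 1  # number of observed transitions
    full = Counter()  # counts of (x_prev, y_prev, z_prev, y_now)
    for i in range(1, len(y)):
        full[(x[i - 1], y[i - 1], z[i - 1], y[i])] += 1
    ctx_full, ctx_null, null = Counter(), Counter(), Counter()
    for (a, b, c, d), cnt in full.items():
        ctx_full[(a, b, c)] += cnt  # context under the alternative
        ctx_null[(b, c)] += cnt     # context under the null (x dropped)
        null[(b, c, d)] += cnt
    cmi = 0.0
    for (a, b, c, d), cnt in full.items():
        p_full = cnt / ctx_full[(a, b, c)]           # MLE with x_prev
        p_null = null[(b, c, d)] / ctx_null[(b, c)]  # MLE without x_prev
        cmi += cnt / n * math.log(p_full / p_null)
    return cmi, n


def chi2_sf_even(stat, dof):
    """Upper tail of a chi-squared law; closed form valid for even dof only."""
    half = stat / 2.0
    term, total = 1.0, 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def causal_test(x, y, z):
    """Return (statistic, assumed degrees of freedom, p-value)."""
    cmi, n = plugin_cmi(x, y, z)
    stat = max(2.0 * n * cmi, 0.0)  # equals the log-likelihood ratio statistic
    m_x, m_y, m_z = len(set(x)), len(set(y)), len(set(z))
    dof = m_y * m_z * (m_x - 1) * (m_y - 1)  # assumed parameter count
    return stat, dof, chi2_sf_even(stat, dof)


# Simulated example: y copies the previous x with probability 0.9 and is
# otherwise a fair coin flip, so x genuinely influences y; z is unrelated.
random.seed(0)
N = 2000
x = [random.randint(0, 1) for _ in range(N)]
z = [random.randint(0, 1) for _ in range(N)]
y = [0] + [x[i - 1] if random.random() < 0.9 else random.randint(0, 1)
           for i in range(1, N)]
stat, dof, p = causal_test(x, y, z)
print(f"statistic={stat:.1f}, dof={dof}, p-value={p:.3g}")
```

The statistic is $2n$ times the plug-in estimate, which is the sense in which the abstract identifies the plug-in estimator with the log-likelihood ratio statistic. In this sketch the alphabet sizes are read off the observed symbols, which is only reasonable when every symbol actually appears in the data.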