Causal investigation in observational studies is a major challenge in research settings where randomized trials or intervention-based studies are not feasible. We develop an information-geometric framework for causal discovery and inference based on "predictive asymmetry". For a pair $(X, Y)$, predictive asymmetry enables assessment of whether $X$ is more likely to cause $Y$ or vice versa. The asymmetry between cause and effect becomes particularly simple if $X$ and $Y$ are deterministically related. We propose a new metric called the Directed Mutual Information ($DMI$) and establish its key statistical properties. $DMI$ can not only detect complex non-linear association patterns in bivariate data but also detect and infer causal relations. Our proposed methodology relies on scalable non-parametric density estimation using the Fourier transform. The resulting estimation method is many times faster than classical bandwidth-based density estimation. We investigate key asymptotic properties of the $DMI$ methodology and use a data-splitting technique to facilitate causal inference with $DMI$. Through simulation studies and an application, we illustrate the performance of $DMI$.
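To give a concrete sense of why Fourier-based density estimation scales well, the following is a minimal sketch of a binned Gaussian kernel density estimate computed with the FFT. It is not the estimator used for $DMI$, which the abstract does not specify; in particular it still uses a bandwidth, chosen here by the normal-reference rule of thumb. The function name `fft_kde` and the parameters `grid_size`, `bandwidth`, and `pad` are hypothetical and introduced only for illustration.

```python
import numpy as np


def fft_kde(samples, grid_size=1024, bandwidth=None, pad=3.0):
    """Binned Gaussian KDE on a regular grid, smoothed in the Fourier domain.

    Illustrative stand-in only: the bandwidth rule and the Gaussian kernel
    are assumptions, not details taken from the abstract.
    """
    x = np.asarray(samples, dtype=float)
    n = x.size
    if bandwidth is None:
        # Normal-reference rule of thumb (assumed for this sketch).
        bandwidth = 1.06 * x.std(ddof=1) * n ** (-1 / 5)

    # Regular grid covering the data plus a margin of `pad` bandwidths.
    lo = x.min() - pad * bandwidth
    hi = x.max() + pad * bandwidth
    counts, edges = np.histogram(x, bins=grid_size, range=(lo, hi))
    dx = edges[1] - edges[0]
    centers = edges[:-1] + dx / 2

    # Zero-pad to twice the grid length so circular convolution
    # does not wrap mass from one end of the grid to the other.
    m = 2 * grid_size
    freqs = np.fft.rfftfreq(m, d=dx)  # cycles per unit of x
    data_ft = np.fft.rfft(counts, n=m)

    # Fourier transform of a Gaussian kernel with standard deviation `bandwidth`.
    kernel_ft = np.exp(-0.5 * (2 * np.pi * freqs * bandwidth) ** 2)

    # Multiply in the Fourier domain, invert, and normalise to a density.
    smoothed = np.fft.irfft(data_ft * kernel_ft, n=m)[:grid_size]
    density = np.clip(smoothed, 0.0, None) / (n * dx)
    return centers, density


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid, dens = fft_kde(rng.standard_normal(5000))
    print("integral of estimate:", dens.sum() * (grid[1] - grid[0]))
```

The smoothing step costs $O(m \log m)$ in the grid size $m$ after a single pass over the data to bin it, whereas evaluating a kernel sum directly at $m$ points costs $O(nm)$ for $n$ samples.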