Adjusting for confounding and imbalance when establishing statistical relationships is an increasingly important task, and causal inference methods have emerged as the most popular tool for this purpose. Causal inference has been developed mainly for scalar outcomes and, more recently, for distributional outcomes. We introduce a general framework for causal inference when outcomes reside in general geodesic metric spaces. The framework draws on a novel geodesic calculus that facilitates scalar multiplication for geodesics and allows treatment effects to be characterized through the concept of the geodesic average treatment effect. Using ideas from Fréchet regression, we develop estimation methods for the geodesic average treatment effect and derive consistency and rates of convergence for the proposed estimators. We also study uncertainty quantification and inference for the treatment effect. Our methodology is illustrated by a simulation study and three real data examples: compositional outcomes from U.S. statewise energy source data, used to study the effect of coal mining; network data of New York taxi trips, where the effect of the COVID-19 pandemic is of interest; and brain functional connectivity network data, used to study the effect of Alzheimer's disease.
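To make the notion of a geodesic between two group-level outcomes concrete, the following is a minimal sketch under stated assumptions, not the estimator proposed here. It assumes compositional outcomes are mapped to the unit sphere by the square-root transform and compared with the geodesic (arc-length) metric; it computes naive group-wise Fréchet means with no adjustment for confounding, and then traces the geodesic connecting them. All function names and the simulated Dirichlet data are hypothetical.

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    c = float(np.clip(np.dot(p, q), -1.0, 1.0))
    v = q - c * p                      # component of q orthogonal to p
    nv = np.linalg.norm(v)
    if nv < 1e-12:                     # p and q (numerically) coincide
        return np.zeros_like(p)
    return np.arccos(c) * v / nv       # length equals the geodesic distance

def sphere_exp(p, v):
    """Exp map on the unit sphere: move from p along tangent vector v."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return p.copy()
    return np.cos(nv) * p + np.sin(nv) * v / nv

def frechet_mean(points, n_iter=200, tol=1e-12):
    """Fréchet mean on the sphere via fixed-point (gradient) iteration."""
    mu = points.mean(axis=0)
    mu = mu / np.linalg.norm(mu)       # start from the projected Euclidean mean
    for _ in range(n_iter):
        step = np.mean([sphere_log(mu, x) for x in points], axis=0)
        mu = sphere_exp(mu, step)
        mu = mu / np.linalg.norm(mu)   # guard against rounding drift
        if np.linalg.norm(step) < tol:
            break
    return mu

def geodesic(p, q, t):
    """Point at fraction t along the geodesic from p (t=0) to q (t=1)."""
    return sphere_exp(p, t * sphere_log(p, q))

# Hypothetical simulated compositions (3 parts), square-root mapped to the sphere.
rng = np.random.default_rng(0)
control = np.sqrt(rng.dirichlet([4.0, 3.0, 2.0], size=200))
treated = np.sqrt(rng.dirichlet([2.0, 3.0, 4.0], size=200))

mu0 = frechet_mean(control)            # naive control-group Fréchet mean
mu1 = frechet_mean(treated)            # naive treated-group Fréchet mean
midpoint = geodesic(mu0, mu1, 0.5)     # halfway along the connecting geodesic

# Squaring maps a point on the sphere back to a composition that sums to 1.
print("midpoint composition:", midpoint ** 2)
```

In this sketch the contrast between groups is a path in the outcome space rather than a scalar difference, which is why a calculus on geodesics is needed before effects can be scaled or compared. Replacing the naive group means with confounder-adjusted means is the step that the estimation methods address and is deliberately left out.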