When comparing two samples of data, we observe a change in the distribution of an outcome variable. In the presence of multiple explanatory variables, how much of that change can be attributed to each possible cause? We develop a new estimation strategy that, given a causal model, combines regression and re-weighting methods to quantify the contribution of each causal mechanism. The proposed methodology is multiply robust, meaning that it still recovers the target parameter under partial misspecification. We prove that our estimator is consistent and asymptotically normal. Moreover, it can be incorporated into existing frameworks for causal attribution, such as Shapley values, which then inherit its consistency and large-sample distributional properties. The method performs well in Monte Carlo simulations, and we illustrate its usefulness in an empirical application. It is implemented as part of the Python library DoWhy (arXiv:2011.04216, arXiv:2206.06821).
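To make the idea of combining regression and re-weighting concrete, the following is a minimal sketch of a doubly robust estimator in the simplest possible setting. It is not the paper's estimator and does not use the DoWhy API. It assumes a two-variable causal model X -> Y with a discrete X, takes the mean of Y as the target, and attributes the change along a single ordering (first P(X), then P(Y | X)) rather than averaging over orderings as a Shapley-based attribution would. All function names and simulation parameters are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter
from statistics import fmean


def simulate(n, x_probs, slope, seed):
    """Draw n pairs (x, y) with x in {0, 1, 2} and y = slope * x + noise."""
    rng = random.Random(seed)
    xs = rng.choices([0, 1, 2], weights=x_probs, k=n)
    return [(x, slope * x + rng.gauss(0.0, 1.0)) for x in xs]


def fit_cell_means(sample):
    """Outcome regression: the mean of y within each value of x."""
    cells = {}
    for x, y in sample:
        cells.setdefault(x, []).append(y)
    means = {x: fmean(ys) for x, ys in cells.items()}
    return lambda x: means[x]


def fit_weights(old, new):
    """Re-weighting: ratio of empirical frequencies p_new(x) / p_old(x)."""
    c_old = Counter(x for x, _ in old)
    c_new = Counter(x for x, _ in new)
    return lambda x: (c_new[x] / len(new)) / (c_old[x] / len(old))


def counterfactual_mean(old, new, mu, w):
    """Estimate E[Y] under the new P(X) combined with the old P(Y | X)."""
    # Regression part: old-sample outcome model averaged over new-sample x.
    regression_part = fmean([mu(x) for x, _ in new])
    # Correction part: re-weighted residuals of the old sample.
    correction = fmean([w(x) * (y - mu(x)) for x, y in old])
    return regression_part + correction


# Hypothetical data: both P(X) and the slope of Y given X change.
old = simulate(5000, [0.5, 0.3, 0.2], slope=1.0, seed=0)
new = simulate(5000, [0.2, 0.3, 0.5], slope=1.5, seed=1)

mu_old = fit_cell_means(old)
w = fit_weights(old, new)
theta = counterfactual_mean(old, new, mu_old, w)

mean_old = fmean([y for _, y in old])
mean_new = fmean([y for _, y in new])
due_to_x = theta - mean_old          # contribution of the change in P(X)
due_to_y_given_x = mean_new - theta  # contribution of the change in P(Y | X)
print(f"P(X): {due_to_x:.3f}, P(Y|X): {due_to_y_given_x:.3f}")

# Robustness check: deliberately break one of the two nuisance models.
bad_mu = lambda x: mean_old  # misspecified regression, ignores x
bad_w = lambda x: 1.0        # misspecified weights, ignore the shift in P(X)
theta_bad_mu = counterfactual_mean(old, new, bad_mu, w)
theta_bad_w = counterfactual_mean(old, new, mu_old, bad_w)
```

In this toy setting the estimate is unchanged when either the regression or the weights are replaced by a wrong model, as long as the other one is correct. That is the two-mechanism analogue of the multiple robustness property described above. The two contributions sum to the total change in the mean by construction.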