We consider the problem of estimating the counterfactual joint distribution of multiple quantities of interest (e.g., outcomes) in a multivariate causal model that extends the classical difference-in-differences design. Existing methods for this task fall short in one of two ways. Some fit a separate univariate causal model to each dimension, which ignores the correlation structure among the dimensions of the multivariate outcome and therefore yields incorrect counterfactual distributions. Others work directly with the multivariate causal model but scale poorly even on moderate-size datasets. We propose a method that alleviates both issues by leveraging a robust latent one-dimensional subspace of the original high-dimensional space and exploiting efficient estimation from the univariate causal model on that subspace. Because the construction of the one-dimensional subspace uses information from all dimensions, our method captures the correlation structure and produces good estimates of the counterfactual distribution. We demonstrate the advantages of our approach over existing methods on both synthetic and real-world data.