Various deep generative models have been proposed to estimate the distributions of potential outcomes from observational data. However, none of them have the favorable theoretical property of general Neyman-orthogonality and, associated with it, quasi-oracle efficiency and double robustness. In this paper, we introduce a general suite of generative Neyman-orthogonal (doubly-robust) learners that estimate the conditional distributions of potential outcomes. Our proposed generative doubly-robust learners (GDR-learners) are flexible and can be instantiated with many state-of-the-art deep generative models. In particular, we develop GDR-learners based on (a) conditional normalizing flows (which we call GDR-CNFs), (b) conditional generative adversarial networks (GDR-CGANs), (c) conditional variational autoencoders (GDR-CVAEs), and (d) conditional diffusion models (GDR-CDMs). Unlike existing methods, our GDR-learners possess the properties of quasi-oracle efficiency and rate double robustness, and are thus asymptotically optimal. In a series of (semi-)synthetic experiments, we demonstrate that our GDR-learners are highly effective and outperform existing methods in estimating the conditional distributions of potential outcomes.
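The abstract does not spell out the construction, but the doubly-robust flavor of such learners can be illustrated with a generic AIPW-style pseudo-loss: the negative log-likelihood of the generative model on observed treated outcomes is inverse-propensity weighted and corrected by a plug-in term from a first-stage nuisance model. The sketch below is a minimal illustration of that general recipe under assumed notation (`a` treatment indicators, `propensity` estimated propensity scores, `nll_obs` per-sample negative log-likelihoods on observed outcomes, `nll_plugin` the same loss evaluated on draws from a nuisance outcome model); it is not the paper's actual GDR-learner.

```python
import numpy as np

def dr_loss_terms(a, propensity, nll_obs, nll_plugin):
    """AIPW-style doubly-robust combination of per-sample losses.

    Each term is  (A/pi) * L_obs + (1 - A/pi) * L_plugin,
    so the estimate stays consistent if either the propensity
    model or the plug-in outcome model is correct (illustrative
    sketch only; names and form are assumptions).
    """
    w = a / propensity
    return w * nll_obs + (1.0 - w) * nll_plugin

# Toy numbers for three samples.
a = np.array([1.0, 0.0, 1.0])            # treatment indicators
pi = np.array([0.5, 0.4, 0.8])           # estimated propensities
nll_obs = np.array([1.0, 0.0, 2.0])      # -log g(Y|X) on observed outcomes
nll_plugin = np.array([0.8, 1.2, 1.5])   # -log g(Yhat|X) under nuisance model

loss = dr_loss_terms(a, pi, nll_obs, nll_plugin).mean()
```

For untreated samples (`a = 0`) the observed-outcome term drops out and only the plug-in correction remains, which is what makes the loss usable for learning the treated-outcome distribution from all samples.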