Motivated by growing interest in cross-domain learning, we pose a new generative modeling problem: generating counterfactual samples in a target domain from factual observations in a source domain. We work in an unsupervised setting with no parallel or joint datasets, relying only on separate observational samples and a causal graph for each domain, which makes the problem harder than conventional counterfactual generation. Central to our method is the separation of exogenous causes into effect-intrinsic and domain-intrinsic variables. This separation lets us merge the domain-specific causal graphs into a single joint causal graph through the shared effect-intrinsic exogenous variables. We propose using neural causal models within this joint framework to enable accurate counterfactual generation under standard identifiability assumptions, and we introduce a novel loss function that separates effect-intrinsic from domain-intrinsic variables during training. Given a factual observation, our framework combines the posterior distribution of the effect-intrinsic variables from the source domain with the prior distribution of the domain-intrinsic variables from the target domain to synthesize the desired counterfactuals, consistent with Pearl's causal hierarchy. Interestingly, when domain shift is restricted to changes in causal mechanisms without accompanying covariate shift, our training procedure parallels solving a conditional optimal transport problem. Empirical evaluations on a synthetic dataset show that the counterfactuals our framework generates in the target domain closely match the ground truth.
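The generation step described above (source-domain posterior over effect-intrinsic variables, combined with the target-domain prior over domain-intrinsic variables) can be illustrated with a minimal sketch. This is not the neural causal model or loss from the paper: it assumes a hypothetical one-dimensional linear-Gaussian mechanism x = w_d * u_e + u_d in each domain d, with u_e ~ N(0, 1) shared across domains and u_d ~ N(0, s_d^2) domain-specific, so that the posterior has a closed form. The function name and all parameters are illustrative assumptions.

```python
import numpy as np


def sample_target_counterfactuals(x_src, w_src, s_src, w_tgt, s_tgt, n, rng):
    """Toy cross-domain counterfactual sampler (illustrative assumption only).

    Assumed mechanism in each domain d: x = w_d * u_e + u_d, with
    u_e ~ N(0, 1) shared (effect-intrinsic) and u_d ~ N(0, s_d^2)
    domain-specific (domain-intrinsic).
    """
    # Abduction: closed-form Gaussian posterior of u_e given the source observation.
    denom = w_src**2 + s_src**2
    post_mean = w_src * x_src / denom
    post_var = s_src**2 / denom
    u_e = rng.normal(post_mean, np.sqrt(post_var), size=n)

    # Domain-intrinsic noise is drawn from the target-domain prior,
    # not inferred from the source observation.
    u_d_tgt = rng.normal(0.0, s_tgt, size=n)

    # Prediction: push both through the assumed target-domain mechanism.
    return w_tgt * u_e + u_d_tgt


rng = np.random.default_rng(0)
samples = sample_target_counterfactuals(
    x_src=2.0, w_src=1.0, s_src=0.5, w_tgt=-2.0, s_tgt=0.3, n=200_000, rng=rng
)
print(samples.mean(), samples.var())
```

In this toy setting the counterfactual distribution is Gaussian with mean w_tgt * post_mean and variance w_tgt^2 * post_var + s_tgt^2, which gives a simple check on the sampler. With neural mechanisms the posterior would generally have to be approximated rather than computed in closed form.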