One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to learn a mapping from a novel experimental condition to an outcome of interest, provided the training data covers a sufficient variety of experiments, coping with a large combinatorial space of possible interventions is hard. Under a typical sparse experimental design, learning this mapping is ill-posed unless one relies on heavy regularization or prior distributions. Such assumptions may or may not be reliable, and can be hard to defend or test. In this paper, we take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system, communicated in the well-understood language of factor graph models. A postulated $\textit{interventional factor model}$ (IFM) may not always be informative, but it conveniently abstracts away the need to model unmeasured confounding and feedback mechanisms explicitly, leading to directly testable claims. We derive necessary and sufficient conditions for causal effect identifiability with IFMs using data from a collection of experimental settings, and implement practical algorithms for generalizing expected outcomes to novel conditions never observed in the data.
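To make the idea of generalizing through a factorization concrete, here is a minimal sketch under assumptions that are not taken from the abstract: a hypothetical system of three binary variables, two binary intervention variables `sigma1` and `sigma2`, and an assumed factorization p(x; sigma) ∝ f1(x1, x2; sigma1) · f2(x2, x3; sigma2) with strictly positive factors. Under that assumption, the distribution in the never-observed regime (1, 1) is proportional to p(x; 1, 0) · p(x; 0, 1) / p(x; 0, 0), so it can be recovered from three observed regimes. All names below (`make_factors`, `joint`, `extrapolate`) are illustrative, and the sketch uses exact distributions rather than estimates from data.

```python
import numpy as np


def make_factors(seed=0):
    """Draw strictly positive, hypothetical factors for the toy model."""
    rng = np.random.default_rng(seed)
    f1 = rng.uniform(0.5, 2.0, size=(2, 2, 2))  # indexed by (sigma1, x1, x2)
    f2 = rng.uniform(0.5, 2.0, size=(2, 2, 2))  # indexed by (sigma2, x2, x3)
    return f1, f2


def joint(f1, f2, s1, s2):
    """Normalized p(x1, x2, x3; sigma1=s1, sigma2=s2) under the assumed factorization."""
    unnorm = f1[s1][:, :, None] * f2[s2][None, :, :]  # broadcast to shape (2, 2, 2)
    return unnorm / unnorm.sum()


def extrapolate(p00, p10, p01):
    """Estimate the unseen regime (1, 1) from regimes (0, 0), (1, 0), (0, 1).

    Relies on positivity of p00 and on each factor depending on a single
    intervention variable; normalizing constants cancel on renormalization.
    """
    ratio = p10 * p01 / p00
    return ratio / ratio.sum()


f1, f2 = make_factors(seed=0)
p11_hat = extrapolate(joint(f1, f2, 0, 0), joint(f1, f2, 1, 0), joint(f1, f2, 0, 1))

# Sanity check against the regime that was held out from the "training" regimes.
assert np.allclose(p11_hat, joint(f1, f2, 1, 1))

# Expected value of a hypothetical outcome Y = X1 + X2 + X3 in the unseen regime.
x1, x2, x3 = np.indices((2, 2, 2))
expected_y = float((p11_hat * (x1 + x2 + x3)).sum())
print(expected_y)
```

If a factor depended on both intervention variables at once, the ratio above would no longer cancel, which is one way to see why identifiability hinges on the postulated factor structure and on which regimes were observed.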