In scientific domains -- from biology to the social sciences -- many questions boil down to \textit{What effect will we observe if we intervene on a particular variable?} If the causal relationships (e.g.~a causal graph) are known, the intervention distributions can be estimated. In the absence of this domain knowledge, the causal structure must be discovered from the available observational data. However, observational data are often compatible with multiple causal graphs, making methods that commit to a single structure prone to overconfidence. A principled way to manage this structural uncertainty is Bayesian inference, which averages over a posterior distribution on possible causal structures and functional mechanisms. Unfortunately, the number of causal structures grows super-exponentially with the number of nodes in the graph, making exact computation intractable. We propose to circumvent these challenges by using meta-learning to create an end-to-end model: the Model-Averaged Causal Estimation Transformer Neural Process (MACE-TNP). The model is trained to predict the Bayesian model-averaged interventional posterior distribution, and its end-to-end nature bypasses the need for expensive posterior computations. Empirically, we demonstrate that MACE-TNP outperforms strong Bayesian baselines. Our work establishes meta-learning as a flexible and scalable paradigm for approximating complex Bayesian causal inference, one that can be extended to increasingly challenging settings in the future.
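The model-averaged quantity targeted above can be written as a standard Bayesian model average; the notation below is illustrative (the symbols $G$, $\theta$, and $\mathcal{D}$ are our own shorthand for a causal graph, its mechanism parameters, and the observed data, not notation taken from the paper):

\[
p\bigl(x_j \mid \mathrm{do}(x_i = \tilde{x}),\, \mathcal{D}\bigr)
\;=\; \sum_{G} p(G \mid \mathcal{D}) \int p\bigl(x_j \mid \mathrm{do}(x_i = \tilde{x}),\, G,\, \theta\bigr)\, p(\theta \mid G, \mathcal{D})\, \mathrm{d}\theta,
\]

where the outer sum ranges over all causal graphs consistent with the variables. It is this sum over a super-exponentially large set of graphs that the end-to-end model is trained to approximate directly, rather than enumerate.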