Estimating causal quantities traditionally relies on bespoke estimators tailored to specific assumptions. Recently proposed Causal Foundation Models (CFMs) promise a more unified approach by amortising causal discovery and inference in a single step. However, current CFMs cannot incorporate domain knowledge, which can lead to suboptimal predictions. We bridge this gap by introducing methods to condition CFMs on causal information, such as the causal graph or more readily available ancestral information. When complete causal graph information is too strict a requirement, our approach also leverages partial causal information effectively. We systematically evaluate conditioning strategies and find that injecting learnable biases into the attention mechanism is the most effective way to utilise both full and partial causal information. Our experiments show that this conditioning allows a general-purpose CFM to match the performance of specialised models trained on specific causal structures. Overall, our approach addresses a central hurdle on the path towards all-in-one causal foundation models: the capability to answer causal queries in a data-driven manner while effectively leveraging any amount of domain expertise.
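As a rough illustration (not the paper's implementation), the sketch below shows one way to inject learnable biases into an attention layer conditioned on causal information: each ordered pair of variables is labelled with a relation type (e.g. unknown, edge, ancestor), and a learnable scalar per relation is added to the attention logits. The class name, the relation encoding, and the single-head layout are assumptions made for this example.

```python
# Hypothetical sketch: attention with a learnable, graph-conditioned bias.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphBiasedAttention(nn.Module):
    def __init__(self, dim: int, num_relations: int = 4):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learnable scalar bias per relation type. Relation 0 can stand
        # for "no information", so partial causal knowledge is handled by
        # labelling unknown pairs with 0 and known pairs with their relation.
        self.rel_bias = nn.Embedding(num_relations, 1)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor, relations: torch.LongTensor) -> torch.Tensor:
        # x:         (batch, num_vars, dim) token embeddings, one per variable
        # relations: (batch, num_vars, num_vars) integer relation labels
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        # Additive bias on the attention logits, indexed by relation type.
        scores = scores + self.rel_bias(relations).squeeze(-1)
        attn = F.softmax(scores, dim=-1)
        return torch.matmul(attn, v)
```

Because the bias is additive on the logits and indexed per pair, the same layer degrades gracefully when only part of the graph (or only ancestral information) is supplied: unlabelled pairs simply receive the default relation.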