We consider the problem of predicting perturbation effects via causal models. In many applications, it is a priori unknown which mechanisms of a system are modified by an external perturbation, even though features of the perturbation are available. For example, in genomics, some properties of a drug may be known, but not its causal effects on the regulatory pathways of cells. We propose a generative intervention model (GIM) that learns to map these perturbation features to distributions over atomic interventions in a jointly estimated causal model. Unlike prior approaches, this enables us to predict the distribution shifts induced by unseen perturbation features while gaining insight into their mechanistic effects in the underlying data-generating process. On synthetic data and scRNA-seq drug perturbation data, GIMs achieve robust out-of-distribution predictions on par with unstructured approaches, while effectively inferring the underlying perturbation mechanisms, often better than other causal inference methods.
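To make the central idea concrete, the following is a minimal sketch of how perturbation features might be mapped to a distribution over atomic interventions and then pushed through a causal model. It is an illustration under stated assumptions, not the implementation behind the results above: the causal model is assumed to be a linear additive-noise SCM, an atomic intervention is assumed to be a mean shift on a subset of mechanisms, and the names `intervention_distribution`, `sample_perturbed`, `A_target`, and `A_shift` are hypothetical. The sketch only covers sampling; jointly estimating the causal model and the mapping from data is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def intervention_distribution(gamma, A_target, A_shift):
    """Map perturbation features gamma to a distribution over atomic interventions.

    Assumption: each mechanism j is targeted independently with probability
    probs[j], and a targeted mechanism receives the mean shift shifts[j].
    """
    probs = sigmoid(A_target @ gamma)   # shape (d,), Bernoulli target probabilities
    shifts = A_shift @ gamma            # shape (d,), shift applied if targeted
    return probs, shifts


def sample_perturbed(W, gamma, A_target, A_shift, n, rng, noise_scale=1.0):
    """Draw n samples from the perturbed linear SCM x_j = sum_i W[i, j] x_i + u_j.

    W[i, j] is the weight of edge i -> j and must describe an acyclic graph.
    """
    d = W.shape[0]
    probs, shifts = intervention_distribution(gamma, A_target, A_shift)
    # Sample one atomic intervention: which mechanisms are modified.
    targets = rng.random(d) < probs
    offset = np.where(targets, shifts, 0.0)
    eps = noise_scale * rng.standard_normal((n, d))
    # Solving x = W^T x + u gives, in row form, X = U (I - W)^{-1}.
    X = (eps + offset) @ np.linalg.inv(np.eye(d) - W)
    return X, targets


# Usage: a chain 0 -> 1 -> 2 and a one-dimensional perturbation feature.
W = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0]])
A_target = np.array([[50.0], [-50.0], [-50.0]])  # feature almost surely targets node 0
A_shift = np.array([[3.0], [3.0], [3.0]])
rng = np.random.default_rng(0)
X, targets = sample_perturbed(W, np.array([1.0]), A_target, A_shift, n=1000, rng=rng)
```

Because the mapping takes features rather than perturbation identities as input, the same two functions can be evaluated on a feature vector `gamma` that never appeared during training, which is the sense in which such a model can make predictions for unseen perturbations while exposing the inferred `targets`.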