We consider the task of causal imputation, where we aim to predict the outcomes of some set of actions across a wide range of possible contexts. As a running example, we consider predicting how different drugs affect cells from different cell types. We study the index-only setting, where the actions and contexts are categorical variables with a finite number of possible values. Even in this simple setting, a practical challenge arises, since often only a small subset of the possible action-context pairs has been studied. Models must therefore extrapolate to novel action-context pairs, which can be framed as a form of matrix completion with rows indexed by actions, columns indexed by contexts, and matrix entries corresponding to outcomes. We introduce a novel model class based on structural causal models (SCMs), in which the outcome is expressed as a counterfactual, actions are expressed as interventions on an instrumental variable, and contexts are defined by the initial state of the system. We show that, under a linearity assumption, this setup induces a latent factor model over the matrix of outcomes, with an additional fixed effect term. To perform causal prediction based on this model class, we introduce a simple extension to the Synthetic Interventions estimator (Agarwal et al., 2020). We evaluate several matrix completion approaches on the PRISM drug repurposing dataset, showing that our method outperforms all other matrix completion approaches considered.
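To make the matrix-completion framing concrete, the following is a minimal sketch, not the estimator proposed here. It simulates noise-free outcomes from a low-rank latent factor model with a fixed effect and imputes one held-out action-context entry by principal component regression, in the spirit of synthetic-interventions-style estimators. Several things are assumptions made purely for illustration: the function name `impute_entry`, the choice of a per-context fixed effect, handling that fixed effect by centering, the block-shaped missingness pattern, and the simulated data (which are not PRISM).

```python
import numpy as np


def impute_entry(Y, observed, action, context, rank):
    """Impute Y[action, context] by principal component regression.

    Y        : (n_actions, n_contexts) outcomes; unobserved entries are never read.
    observed : boolean mask of the same shape, True where Y was measured.
    rank     : number of singular components to keep (hypothetical tuning knob).
    """
    # Donor contexts: other contexts in which the target action was observed.
    donors = np.flatnonzero(observed[action])
    donors = donors[donors != context]
    # Training actions: observed in the target context and in every donor context.
    train = np.flatnonzero(observed[:, context] & observed[:, donors].all(axis=1))
    if donors.size == 0 or train.size == 0:
        raise ValueError("no donor contexts or training actions available")

    X = Y[np.ix_(train, donors)]   # training actions x donor contexts
    y = Y[train, context]          # training actions in the target context

    # Centering adds an intercept, which absorbs a per-context fixed effect
    # (an assumption of this sketch).
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    k = min(rank, s.size)
    # Regress on the top-k principal components only.
    w = Vt[:k].T @ ((U[:, :k].T @ (y - y_mean)) / s[:k])

    # Transfer the learned weights to the target action's donor outcomes.
    return y_mean + (Y[action, donors] - x_mean) @ w


# Simulated data: rank-3 latent factors plus a per-context fixed effect, no noise.
rng = np.random.default_rng(0)
n_actions, n_contexts, r = 30, 20, 3
action_factors = rng.normal(size=(n_actions, r))
context_factors = rng.normal(size=(n_contexts, r))
context_effect = rng.normal(size=n_contexts)
Y_full = action_factors @ context_factors.T + context_effect

# Hold out a block of action-context pairs that were never "studied".
observed = np.ones((n_actions, n_contexts), dtype=bool)
observed[20:, 15:] = False
Y_obs = np.where(observed, Y_full, np.nan)

estimate = impute_entry(Y_obs, observed, action=25, context=17, rank=r)
error = abs(estimate - Y_full[25, 17])
print(f"absolute imputation error: {error:.2e}")
```

Because the simulated outcomes are noise-free and exactly low-rank, the held-out entry should be recovered up to numerical precision. With noisy data, `rank` would have to be chosen, for example by cross-validation on observed entries.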