Predicting how different interventions will causally affect a specific individual is important in a variety of domains such as personalized medicine, public policy, and online marketing. However, most existing causal methods cannot generalize to predicting the effects of previously unseen interventions (e.g., a newly invented drug), because they require data for individuals who received the intervention. Here, we consider zero-shot causal learning: predicting the personalized effects of novel, previously unseen interventions. To tackle this problem, we propose CaML, a causal meta-learning framework that formulates the personalized prediction of each intervention's effect as a task. Rather than training a separate model for each intervention, CaML trains a single meta-model across thousands of tasks, each constructed by sampling an intervention and individuals who either did or did not receive it. By leveraging both intervention information (e.g., a drug's attributes) and individual features (e.g., a patient's history), CaML is able to predict the personalized effects of unseen interventions. Experimental results on real-world datasets of large-scale medical claims and cell-line perturbations demonstrate the effectiveness of our approach. Most strikingly, CaML's zero-shot predictions outperform even strong baselines that have direct access to data from the target interventions.
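The task construction described in the abstract (sample an intervention, then sample individuals who did and did not receive it) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Record` and `Task` types, the `sample_task` helper, the toy drug names, and the random data are all hypothetical, and the sketch covers only task sampling with one intervention held out for zero-shot evaluation, not the training of the meta-model itself.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Record:
    features: List[float]  # individual features, e.g. an encoded patient history
    intervention: str      # identifier of the intervention this individual received
    outcome: float         # observed outcome


@dataclass
class Task:
    intervention: str
    attributes: List[float]  # intervention information, e.g. a drug's attributes
    treated: List[Record]    # individuals who received the intervention
    control: List[Record]    # individuals who did not receive it


def sample_task(records: List[Record],
                attributes: Dict[str, List[float]],
                n_per_group: int,
                rng: random.Random) -> Task:
    """Sample one intervention, then individuals who did and did not receive it."""
    w = rng.choice(sorted(attributes))
    treated_pool = [r for r in records if r.intervention == w]
    control_pool = [r for r in records if r.intervention != w]
    treated = rng.sample(treated_pool, min(n_per_group, len(treated_pool)))
    control = rng.sample(control_pool, min(n_per_group, len(control_pool)))
    return Task(w, attributes[w], treated, control)


# Toy data (hypothetical): three interventions with two attributes each.
rng = random.Random(0)
attributes = {"drug_a": [1.0, 0.0], "drug_b": [0.0, 1.0], "drug_c": [1.0, 1.0]}
records = [
    Record([rng.random(), rng.random()], rng.choice(sorted(attributes)), rng.random())
    for _ in range(60)
]

# Hold out one intervention entirely, so it is unseen during training (zero-shot).
held_out = {"drug_c"}
train_attributes = {k: v for k, v in attributes.items() if k not in held_out}
train_records = [r for r in records if r.intervention not in held_out]

# Training tasks are drawn only from the remaining interventions.
tasks = [sample_task(train_records, train_attributes, 8, rng) for _ in range(5)]
```

In this sketch a meta-model would be trained on `tasks` using both `attributes` and each record's `features`, and then queried with the attributes of `drug_c`, for which it has seen no treated individuals.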