Predicting how different interventions will causally affect a specific individual is important in domains such as personalized medicine, public policy, and online marketing. Many methods predict the effect of an existing intervention from historical data on individuals who received it. However, in many settings it is important to predict the effects of novel interventions (\emph{e.g.}, a newly invented drug), which these methods do not address. Here, we consider zero-shot causal learning: predicting the personalized effects of a novel intervention. We propose CaML, a causal meta-learning framework that formulates the personalized prediction of each intervention's effect as a task. CaML trains a single meta-model across thousands of tasks, each constructed by sampling an intervention along with its recipients and nonrecipients. By leveraging both intervention information (\emph{e.g.}, a drug's attributes) and individual features~(\emph{e.g.}, a patient's history), CaML can predict the personalized effects of novel interventions that do not exist at the time of training. Experimental results on real-world datasets in large-scale medical claims and cell-line perturbations demonstrate the effectiveness of our approach. Most strikingly, CaML's zero-shot predictions outperform even strong baselines trained directly on data from the test interventions.
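The task construction described in the abstract (sample an intervention, then sample its recipients and nonrecipients) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_task` and `meta_model_inputs`, the array shapes, the synthetic data, and the choice to concatenate intervention attributes onto individual features are all assumptions made for illustration only.

```python
import numpy as np


def sample_task(rng, X, W, Y, treated, n_per_group=4):
    """Build one task: pick an intervention, then sample recipients and nonrecipients.

    Hypothetical layout (an assumption, not taken from the paper):
      X: (n, d) individual features
      W: (k, p) intervention attributes
      Y: (n,) observed outcomes
      treated: (n, k) boolean matrix, treated[i, j] is True if i received j
    """
    k = rng.integers(W.shape[0])                     # sample an intervention
    recipients = np.flatnonzero(treated[:, k])       # individuals who received it
    nonrecipients = np.flatnonzero(~treated[:, k])   # individuals who did not
    r = rng.choice(recipients, size=min(n_per_group, recipients.size), replace=False)
    c = rng.choice(nonrecipients, size=min(n_per_group, nonrecipients.size), replace=False)
    return {
        "intervention": int(k),
        "w": W[k],
        "X_recipients": X[r],
        "Y_recipients": Y[r],
        "X_nonrecipients": X[c],
        "Y_nonrecipients": Y[c],
    }


def meta_model_inputs(task_w, X):
    """Append the intervention's attributes to every individual's feature row."""
    return np.hstack([X, np.tile(task_w, (X.shape[0], 1))])


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n, d, k, p = 50, 3, 5, 2
X = rng.normal(size=(n, d))
W = rng.normal(size=(k, p))
Y = rng.normal(size=n)
treated = rng.random((n, k)) < 0.3
treated[0, :] = True    # ensure every intervention has at least one recipient
treated[1, :] = False   # and at least one nonrecipient

task = sample_task(rng, X, W, Y, treated)
inputs = meta_model_inputs(task["w"], task["X_recipients"])
```

Because each input row carries the intervention's attributes alongside the individual's features, a model trained on such rows could in principle be queried with the attributes of an intervention that never appeared in training, which is the zero-shot setting the abstract describes.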