Predicting how different interventions will causally affect a specific individual is important in domains such as personalized medicine, public policy, and online marketing. Many methods predict the effect of an existing intervention from historical data on individuals who received it. However, in many settings it is important to predict the effects of novel interventions (e.g., a newly invented drug), which these methods do not address. Here, we consider zero-shot causal learning: predicting the personalized effects of a novel intervention. We propose CaML, a causal meta-learning framework that formulates the personalized prediction of each intervention's effect as a task. CaML trains a single meta-model across thousands of tasks, each constructed by sampling an intervention, its recipients, and its nonrecipients. By leveraging both intervention information (e.g., a drug's attributes) and individual features (e.g., a patient's history), CaML can predict the personalized effects of novel interventions that do not exist at the time of training. Experimental results on real-world datasets in large-scale medical claims and cell-line perturbations demonstrate the effectiveness of our approach. Most strikingly, CaML's zero-shot predictions outperform even strong baselines trained directly on data from the test interventions.
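To make the task construction concrete, the following is a minimal sketch of sampling one task (an intervention, some of its recipients, and some of its nonrecipients) from synthetic data. It is not CaML's implementation: the function `sample_task`, the array layout, the `-1` code for "no intervention", the group size, and the choice to treat everyone who did not receive the sampled intervention as a nonrecipient are all assumptions made for illustration.

```python
import numpy as np


def sample_task(rng, features, outcomes, received, intervention_attrs, n_per_group=4):
    """Build one task: an intervention plus sampled recipients and nonrecipients.

    Hypothetical layout: received[i] is the index of the intervention that
    individual i received, or -1 if they received none.
    """
    # Sample an intervention that has at least one recipient.
    candidates = np.unique(received[received >= 0])
    w = int(rng.choice(candidates))

    # Assumption: anyone who did not receive w counts as a nonrecipient.
    recipients = np.flatnonzero(received == w)
    nonrecipients = np.flatnonzero(received != w)

    # Subsample each group without replacement, capped at the group size.
    treated = rng.choice(recipients, size=min(n_per_group, recipients.size), replace=False)
    control = rng.choice(nonrecipients, size=min(n_per_group, nonrecipients.size), replace=False)

    return {
        "intervention": w,
        "intervention_attrs": intervention_attrs[w],  # e.g., a drug's attributes
        "treated_idx": treated,
        "control_idx": control,
        "treated_x": features[treated],  # e.g., patient histories
        "treated_y": outcomes[treated],
        "control_x": features[control],
        "control_y": outcomes[control],
    }


# Synthetic stand-in data; none of these sizes come from the paper.
rng = np.random.default_rng(0)
n_individuals, n_features, n_interventions, n_attrs = 200, 5, 10, 3
features = rng.normal(size=(n_individuals, n_features))
outcomes = rng.normal(size=n_individuals)
received = rng.integers(-1, n_interventions, size=n_individuals)  # -1 = none
intervention_attrs = rng.normal(size=(n_interventions, n_attrs))

task = sample_task(rng, features, outcomes, received, intervention_attrs)
print(task["intervention"], task["treated_x"].shape, task["control_x"].shape)
```

Calling `sample_task` repeatedly would yield the collection of tasks over which a single meta-model could be trained. At test time, under the same assumptions, a novel intervention would be represented only by its row of attributes, with no recipients available.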