Molecular relational learning, whose goal is to predict the interaction behavior between molecular pairs, has recently attracted a surge of interest in molecular sciences due to its wide range of applications. In this work, we propose CMRL, a model that is robust to distributional shift in molecular relational learning by detecting the core substructure that is causally related to chemical reactions. To do so, we first assume a causal relationship based on domain knowledge of molecular sciences and construct a structural causal model (SCM) that reveals the relationship between variables. Based on the SCM, we introduce a novel conditional intervention framework whose intervention is conditioned on the paired molecule. With the conditional intervention framework, our model successfully learns from the causal substructure and alleviates the confounding effect of shortcut substructures that are spuriously correlated with chemical reactions. Extensive experiments on various tasks with real-world and synthetic datasets demonstrate the superiority of CMRL over state-of-the-art baseline models. Our code is available at https://github.com/Namkyeong/CMRL.