Graph neural networks (GNNs) have recently been applied with success to molecular property prediction, one of the most classical cheminformatics tasks, with a wide range of applications. Despite their effectiveness, we empirically observe that training a single GNN model on diverse molecules with distinct structural patterns limits its prediction performance. Motivated by this observation, we propose a framework that leverages topology-specific prediction models (referred to as experts), each responsible for a group of molecules sharing similar topological semantics. That is, each expert is trained on its corresponding topological group and so learns topology-specific discriminative features. To tackle the key challenge of grouping molecules by their topological patterns, we introduce a clustering-based gating module that assigns an input molecule to one of the clusters, and we further optimize the gating module with two types of self-supervision: topological semantics induced by GNNs and molecular scaffolds. Extensive experiments demonstrate that the proposed framework improves molecular property prediction performance and generalizes better than the baselines to new molecules with unseen scaffolds. The code is available at https://github.com/kimsu55/ToxExpert.
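The routing idea described above, a gating module that assigns each molecule to one cluster and then passes it to that cluster's expert, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes molecule embeddings have already been produced by some GNN encoder, uses a softmax over negative squared distances to hypothetical cluster centroids as the gate, and models each expert as a single linear layer with a sigmoid output. All function and parameter names (`assign_clusters`, `predict_with_experts`, `centroids`, `expert_weights`, `expert_biases`) are illustrative assumptions, and the self-supervised training of the gate is omitted.

```python
import numpy as np


def assign_clusters(embeddings, centroids):
    """Return soft assignment probabilities and hard cluster indices.

    embeddings: (N, d) array, assumed to come from a GNN encoder.
    centroids:  (K, d) array of hypothetical cluster centers.
    """
    # Squared Euclidean distance between every embedding and every centroid.
    diff = embeddings[:, None, :] - centroids[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    # Softmax over negative distances: closer centroids get higher probability.
    logits = -sq_dist
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs, probs.argmax(axis=1)


def predict_with_experts(embeddings, centroids, expert_weights, expert_biases):
    """Route each embedding to the expert of its assigned cluster.

    expert_weights: (K, d) array, one linear expert per cluster (an assumption).
    expert_biases:  (K,) array.
    """
    _, hard = assign_clusters(embeddings, centroids)
    # Select the weights of each molecule's expert, then take a row-wise dot product.
    scores = np.einsum("nd,nd->n", embeddings, expert_weights[hard])
    scores = scores + expert_biases[hard]
    # Sigmoid turns the score into a probability for a binary property.
    return 1.0 / (1.0 + np.exp(-scores)), hard


# Toy usage: two molecules, two clusters, two experts.
embeddings = np.array([[0.0, 0.0], [10.0, 10.0]])
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
expert_weights = np.array([[1.0, 0.0], [0.0, -1.0]])
expert_biases = np.zeros(2)
predictions, clusters = predict_with_experts(
    embeddings, centroids, expert_weights, expert_biases
)
```

The sketch uses a hard (argmax) assignment so that each molecule is handled by exactly one expert, matching the description of assigning a molecule to one of the clusters; a soft variant would instead weight every expert's output by the assignment probabilities.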