Molecular property prediction (MPP) is a fundamental but challenging task in computer-aided drug discovery. A growing number of recent works employ graph-based models for MPP and have made considerable progress in improving prediction performance. However, current models often ignore the relationships between molecules, which could also be helpful for MPP. To address this, we propose a graph structure learning (GSL) based MPP approach, called GSL-MPP. Specifically, we first apply a graph neural network (GNN) over molecular graphs to extract molecular representations. Then, using molecular fingerprints, we construct a molecular similarity graph (MSG). Next, we conduct graph structure learning on the MSG (i.e., molecule-level graph structure learning) to obtain the final molecular embeddings, which fuse the GNN-encoded molecular representations with the relationships among molecules, thereby combining intra-molecule and inter-molecule information. Finally, we use these molecular embeddings to perform MPP. Extensive experiments on seven benchmark datasets show that our method achieves state-of-the-art performance in most cases, especially on classification tasks. Further visualization studies also demonstrate the quality of the molecular representations learned by our method.
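The pipeline described above can be illustrated with a minimal, standard-library-only sketch. It is not the GSL-MPP implementation: fingerprints are assumed to be given as sets of on-bit indices, Tanimoto similarity with a fixed threshold is an assumed way to build the MSG, and a single round of neighbour averaging stands in for the learned graph structure learning step. The function names (`tanimoto`, `build_similarity_graph`, `propagate`) and the toy data are hypothetical.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_similarity_graph(fps: List[Set[int]], threshold: float) -> Dict[int, List[int]]:
    """Assumed MSG construction: link molecules whose similarity reaches the threshold."""
    n = len(fps)
    adj: Dict[int, List[int]] = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fps[i], fps[j]) >= threshold:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def propagate(embeddings: List[List[float]], adj: Dict[int, List[int]]) -> List[List[float]]:
    """One round of mean aggregation over each molecule and its neighbours.

    This is a stand-in for the learned GSL step, not the paper's method.
    """
    fused = []
    for i, emb in enumerate(embeddings):
        group = [emb] + [embeddings[j] for j in adj[i]]
        fused.append([sum(vals) / len(group) for vals in zip(*group)])
    return fused


# Toy inputs: three molecules with made-up fingerprints and made-up
# per-molecule embeddings (in the paper these would come from a GNN encoder).
fingerprints = [{1, 2, 3, 4}, {2, 3, 4, 5}, {10, 11, 12}]
embeddings = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]

graph = build_similarity_graph(fingerprints, threshold=0.5)
fused = propagate(embeddings, graph)
```

In this toy case the first two molecules share three of five distinct bits, so they are linked and their embeddings are averaged, while the third molecule has no neighbours and keeps its own embedding. A downstream predictor would then take `fused` as input.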