AI-driven drug design depends heavily on accurate prediction of molecular properties, which remains a difficult task. The feature representations most commonly used to train deep neural network models are SMILES strings and molecular graphs. These representations are concise and efficient, but they are limited in their ability to capture complex spatial information. Researchers have recently recognized the importance of incorporating the three-dimensional structure of molecules into models. However, capturing spatial information requires introducing additional units into the generator, which adds design and computational cost. A method for molecular property prediction is therefore needed that incorporates spatial structural information while retaining the simplicity and efficiency of graph neural networks. In this work, we propose CTAGE, an embedding approach that uses $k$-hop discrete Ricci curvature to extract structural information from molecular graph data. The approach integrates spatial structural information while preserving the training complexity of the network. Experimental results indicate that introducing node curvature significantly improves the performance of current graph neural network frameworks, which supports the claim that $k$-hop node curvature effectively reflects the relationship between molecular structure and function.
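To make the idea of a $k$-hop node curvature feature concrete, the following is a minimal sketch and not the CTAGE definition, which the abstract does not spell out. It assumes the simplest combinatorial Forman-Ricci curvature on an unweighted graph, $F(u,v) = 4 - \deg(u) - \deg(v)$, takes a node's curvature to be the mean over its incident edges, and takes the $k$-hop value to be the mean node curvature over all nodes within $k$ hops. The function names and both aggregation choices are illustrative assumptions.

```python
from collections import deque


def forman_edge_curvature(adj):
    """Simplest Forman-Ricci curvature of each edge in an unweighted graph.

    adj maps each node to the set of its neighbours (undirected graph).
    Assumption: F(u, v) = 4 - deg(u) - deg(v), ignoring triangles and weights.
    """
    curvature = {}
    for u in adj:
        for v in adj[u]:
            if u < v:  # store each undirected edge once
                curvature[(u, v)] = 4 - len(adj[u]) - len(adj[v])
    return curvature


def node_curvature(adj):
    """Node curvature as the mean curvature of incident edges (an assumption)."""
    edge_curv = forman_edge_curvature(adj)
    result = {}
    for u in adj:
        values = [edge_curv[(min(u, v), max(u, v))] for v in adj[u]]
        result[u] = sum(values) / len(values) if values else 0.0
    return result


def k_hop_nodes(adj, source, k):
    """All nodes within k hops of source, found by breadth-first search."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if distance[u] == k:
            continue  # do not expand beyond k hops
        for v in adj[u]:
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    return set(distance)


def k_hop_curvature(adj, k):
    """Mean node curvature over each node's k-hop neighbourhood (hypothetical)."""
    base = node_curvature(adj)
    features = {}
    for u in adj:
        hood = k_hop_nodes(adj, u, k)
        features[u] = sum(base[v] for v in hood) / len(hood)
    return features


# Heavy-atom graph of butane: a path of four carbon atoms, 0-1-2-3.
butane = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
one_hop_features = k_hop_curvature(butane, k=1)
```

A feature computed this way is a single scalar per atom, so it can be appended to the existing node feature vector of a graph neural network without adding layers or changing the message-passing scheme.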