The field of geometric deep learning has had a profound impact on the development of innovative and powerful graph neural network architectures. Disciplines such as computer vision and computational biology have benefited significantly from such methodological advances, which have led to breakthroughs in scientific domains such as protein structure prediction and design. In this work, we introduce GCPNet, a new geometry-complete, SE(3)-equivariant graph neural network designed for 3D molecular graph representation learning. Rigorous experiments across four distinct geometric tasks demonstrate that GCPNet's predictions (1) for protein-ligand binding affinity achieve a statistically significant correlation of 0.608, more than 5% greater than current state-of-the-art methods; (2) for protein structure ranking achieve statistically significant target-local and dataset-global correlations of 0.616 and 0.871, respectively; (3) for Newtonian many-body systems modeling achieve a task-averaged mean squared error less than 0.01, more than 15% better than current methods; and (4) for molecular chirality recognition achieve a state-of-the-art prediction accuracy of 98.7%, better than any other machine learning method to date. The source code, data, and instructions to train new models or reproduce our results are freely available at https://github.com/BioinfoMachineLearning/GCPNet.
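To make the term "SE(3)-equivariant" concrete, the following is a minimal sketch in plain Python. It is not GCPNet and is not taken from the GCPNet repository: `toy_update`, `signed_volume`, and the other helper names are hypothetical, chosen only for illustration. The sketch checks two properties on a toy point cloud: a layer is equivariant if applying a rigid motion (rotation plus translation) before or after the layer gives the same result, and a quantity such as a signed volume is unchanged by rigid motions but changes sign under reflection, which is the kind of signal a chirality-aware model can use.

```python
import math

def rotate_z(p, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)

def transform(points, angle, shift):
    """Apply a rigid motion (rotation about z, then translation) to every point."""
    out = []
    for p in points:
        x, y, z = rotate_z(p, angle)
        out.append((x + shift[0], y + shift[1], z + shift[2]))
    return out

def toy_update(points):
    """Hypothetical toy layer: move each point halfway toward the centroid."""
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return [tuple(0.5 * (p[i] + centroid[i]) for i in range(3)) for p in points]

def signed_volume(points):
    """Scalar triple product of the three edges leaving points[0]."""
    a, b, c = (tuple(q[i] - points[0][i] for i in range(3)) for q in points[1:4])
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return sum(a[i] * cross[i] for i in range(3))

def max_abs_diff(ps, qs):
    """Largest coordinate-wise difference between two point lists."""
    return max(abs(p[i] - q[i]) for p, q in zip(ps, qs) for i in range(3))

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
angle, shift = 0.7, (1.0, -2.0, 0.5)

# Equivariance: transform-then-update should match update-then-transform.
equivariance_gap = max_abs_diff(
    toy_update(transform(points, angle, shift)),
    transform(toy_update(points), angle, shift),
)

# Rigid motions preserve the signed volume; a mirror image negates it.
moved_volume = signed_volume(transform(points, angle, shift))
mirrored_volume = signed_volume([(-x, y, z) for x, y, z in points])

print(equivariance_gap, signed_volume(points), moved_volume, mirrored_volume)
```

In this sketch the equivariance gap is zero up to floating-point error, while the mirrored point cloud yields a signed volume of opposite sign. A model that is invariant to reflections as well as rotations could not separate the two mirror images.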