Molecular docking is critical to structure-based virtual screening, yet the throughput of such workflows is limited by the expensive optimization of scoring functions involved in most docking algorithms. We explore how machine learning can accelerate this process by learning a scoring function with a functional form that allows for more rapid optimization. Specifically, we define the scoring function to be the cross-correlation of multi-channel ligand and protein scalar fields parameterized by equivariant graph neural networks, enabling rapid optimization over rigid-body degrees of freedom with fast Fourier transforms. The runtime of our approach can be amortized at several levels of abstraction, and is particularly favorable for virtual screening settings with a common binding pocket. We benchmark our scoring functions on two simplified docking-related tasks: decoy pose scoring and rigid conformer docking. On crystal structures, our method attains performance similar to that of the widely used Vina and Gnina scoring functions while running faster, and it is more robust on computationally predicted structures. Code is available at https://github.com/bjing2016/scalar-fields.
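To make the cross-correlation idea concrete, the following is a minimal sketch of scoring every grid translation at once with fast Fourier transforms. It is not the released implementation: random arrays stand in for the learned multi-channel scalar fields, only translations on a periodic grid are handled (rotations are omitted), and the names `fft_cross_correlation` and `direct_score` are hypothetical.

```python
import numpy as np

SPATIAL_AXES = (1, 2, 3)  # axis 0 indexes channels; the rest index the grid


def fft_cross_correlation(protein, ligand):
    """Score all cyclic translations t at once.

    scores[t] = sum_c sum_x protein[c, x + t] * ligand[c, x]
    Both inputs are real arrays of shape (channels, n, n, n).
    """
    P = np.fft.fftn(protein, axes=SPATIAL_AXES)
    L = np.fft.fftn(ligand, axes=SPATIAL_AXES)
    # Correlation theorem: cross-correlation = IFFT(P * conj(L)) for real fields.
    per_channel = np.fft.ifftn(P * np.conj(L), axes=SPATIAL_AXES).real
    return per_channel.sum(axis=0)  # sum the contributions of all channels


def direct_score(protein, ligand, t):
    """Reference score for a single translation t, computed without FFTs."""
    shifted = np.roll(protein, shift=tuple(-s for s in t), axis=SPATIAL_AXES)
    return float(np.sum(shifted * ligand))


# Assumption: random fields stand in for fields produced by a neural network.
rng = np.random.default_rng(0)
protein = rng.normal(size=(3, 8, 8, 8))

# Build a "ligand" field that matches the protein field at a known offset.
true_shift = (2, 5, 1)
ligand = np.roll(protein, shift=tuple(-s for s in true_shift), axis=SPATIAL_AXES)

scores = fft_cross_correlation(protein, ligand)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
```

In this sketch, a single pair of forward transforms and one inverse transform yields the score for every translation on the grid, whereas `direct_score` must be called once per translation. The protein transform `P` could also be computed once and reused across many ligands, which is one way such a cost might be amortized when screening against a common pocket.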