In drug discovery, molecular dynamics (MD) simulation of protein-ligand binding is a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. There is a long history of improving the efficiency of MD simulations through better numerical methods and, more recently, by augmenting them with machine learning (ML) methods. Yet challenges remain, such as accurately modeling dynamics over extended timescales. To address this issue, we propose NeuralMD, the first ML surrogate that can facilitate numerical MD and provide accurate simulations of protein-ligand binding dynamics. NeuralMD takes a principled approach built on a novel physics-informed, multi-grained, group-symmetric framework. Specifically, it comprises (1) BindingNet, a model that satisfies group symmetry using vector frames and captures multi-level protein-ligand interactions, and (2) an augmented neural differential equation solver that learns the trajectory under Newtonian mechanics. For evaluation, we design ten single-trajectory and three multi-trajectory binding simulation tasks. We demonstrate the efficiency and effectiveness of NeuralMD: it achieves a 2000$\times$ speedup over standard numerical MD simulation and outperforms all other ML approaches by up to 80\% under the stability metric. We further show qualitatively that NeuralMD reaches more stable binding predictions than other machine learning methods.
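To make the phrase "learns the trajectory under Newtonian mechanics" concrete, the following is a minimal sketch of how a second-order Newtonian system can be integrated as an augmented first-order system in position and velocity, $dx/dt = v$ and $dv/dt = F(x)/m$. It is not the NeuralMD solver: the integrator here is plain velocity Verlet, and `toy_force` is a hypothetical harmonic-spring stand-in for whatever learned force model (such as BindingNet) would supply $F(x)$. All function names and parameters are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Vec = List[float]


def toy_force(x: Vec) -> Vec:
    """Hypothetical stand-in for a learned force model.

    Returns a harmonic restoring force F = -k * x with k = 1.
    A learned model would replace this function.
    """
    k = 1.0
    return [-k * xi for xi in x]


def integrate_newtonian(
    x0: Vec,
    v0: Vec,
    force_fn: Callable[[Vec], Vec],
    mass: float = 1.0,
    dt: float = 0.01,
    steps: int = 100,
) -> Tuple[List[Vec], Vec]:
    """Integrate dx/dt = v, dv/dt = F(x)/m with velocity Verlet.

    Returns the list of positions (including the initial one)
    and the final velocity.
    """
    x, v = list(x0), list(v0)
    a = [f / mass for f in force_fn(x)]
    trajectory = [list(x)]
    for _ in range(steps):
        # Position update uses the current velocity and acceleration.
        x = [xi + vi * dt + 0.5 * ai * dt * dt for xi, vi, ai in zip(x, v, a)]
        # Re-evaluate the force at the new position.
        a_new = [f / mass for f in force_fn(x)]
        # Velocity update averages the old and new accelerations.
        v = [vi + 0.5 * (ai + an) * dt for vi, ai, an in zip(v, a, a_new)]
        a = a_new
        trajectory.append(list(x))
    return trajectory, v


if __name__ == "__main__":
    traj, v_final = integrate_newtonian([1.0], [0.0], toy_force)
    print(traj[-1], v_final)
```

With unit mass and unit spring constant, the exact solution is $x(t) = \cos t$, so the final position after 100 steps of size 0.01 should lie close to $\cos(1)$. In a learned setting, only `force_fn` would change; the augmented position-velocity state is what ties the trajectory to Newton's second law.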