Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can be used directly to simulate CG molecular dynamics. Despite a vastly simplified training setup compared to previous work, we demonstrate that our approach improves performance across several small- to medium-sized protein simulations, reproducing the CG equilibrium distribution and preserving the dynamics of all-atom simulations, such as protein folding events.
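The link between a score function and a force field can be illustrated with a toy example. For a Boltzmann density p(x) ∝ exp(-U(x)/kT), the score d/dx log p(x) equals F(x)/kT, where F = -dU/dx is the force. The sketch below is a minimal illustration under stated assumptions, not the model from this work: an analytical 1D double-well potential stands in for the learned CG energy, the exact score stands in for a trained diffusion model, and the names `potential`, `force`, `score`, and `langevin_from_score`, the reduced units (kT = 1), and the unit friction coefficient are all hypothetical choices.

```python
import numpy as np

KT = 1.0  # thermal energy in reduced units (assumption for this toy example)


def potential(x):
    """Toy double-well potential U(x) = (x^2 - 1)^2 (stand-in for a CG energy)."""
    return (x**2 - 1.0) ** 2


def force(x):
    """Physical force F(x) = -dU/dx for the toy potential."""
    return -4.0 * x * (x**2 - 1.0)


def score(x):
    """Score of the Boltzmann density p(x) ∝ exp(-U(x)/kT): d/dx log p = F(x)/kT."""
    return force(x) / KT


def langevin_from_score(score_fn, x0, n_steps, dt, rng):
    """Overdamped Langevin dynamics driven by a score function (unit friction).

    Update rule: x <- x + dt * kT * score(x) + sqrt(2 * kT * dt) * noise.
    Returns an array of shape (n_steps, *x0.shape).
    """
    x = np.array(x0, dtype=float)
    traj = np.empty((n_steps,) + x.shape)
    for i in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + dt * KT * score_fn(x) + np.sqrt(2.0 * KT * dt) * noise
        traj[i] = x
    return traj


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Eight independent walkers started at the barrier top, x = 0.
    traj = langevin_from_score(score, np.zeros(8), n_steps=500, dt=1e-3, rng=rng)
    print(traj.shape)
```

In the setting described in the abstract, a trained network's score would presumably take the place of `score`, with the simulation loop otherwise unchanged.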