Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can be used directly to simulate CG molecular dynamics. Despite a vastly simplified training setup compared to previous work, we demonstrate that our approach improves performance across several small- to medium-sized protein simulations, reproducing the CG equilibrium distribution and preserving dynamics of all-atom simulations such as protein folding events.
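The link between a score function and a force field can be illustrated with a toy example. For a Boltzmann density p(x) ∝ exp(-U(x)/kT), the score ∇ log p(x) equals -∇U(x)/kT, so multiplying the score by kT gives the force. The sketch below is not the method of this work: it assumes a one-dimensional harmonic potential with an analytic score standing in for a trained diffusion model, reduced units with kT = 1, and a plain Euler-Maruyama integrator for overdamped Langevin dynamics. All function names and parameters are hypothetical.

```python
import numpy as np

KT = 1.0  # thermal energy in reduced units (assumption for this toy example)


def score_harmonic(x, k=2.0, kT=KT):
    """Analytic score d/dx log p(x) for p(x) ∝ exp(-k x^2 / (2 kT)).

    Stand-in for the score of a trained diffusion model (assumption).
    """
    return -k * x / kT


def force_from_score(score, kT=KT):
    """Convert a score into a force: F = -dU/dx = kT * d/dx log p."""
    return kT * score


def overdamped_langevin(score_fn, x0, n_steps, dt=1e-2, kT=KT,
                        friction=1.0, seed=0):
    """Euler-Maruyama integration of overdamped Langevin dynamics.

    Each entry of x0 is treated as an independent walker. Returns an
    array of shape (n_steps, *x0.shape) holding the positions after
    every step.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    traj = np.empty((n_steps,) + x.shape)
    noise_scale = np.sqrt(2.0 * kT * dt / friction)
    for t in range(n_steps):
        force = force_from_score(score_fn(x), kT)
        x = x + (dt / friction) * force + noise_scale * rng.standard_normal(x.shape)
        traj[t] = x
    return traj


if __name__ == "__main__":
    # 2000 independent walkers started at the potential minimum.
    traj = overdamped_langevin(score_harmonic, np.zeros(2000), n_steps=1000)
    # The sample variance should be close to kT / k for the harmonic well.
    print("sample variance:", traj[-1].var())
```

With k = 2 and kT = 1, the stationary variance of the exact dynamics is kT/k = 0.5, so the sampled variance should land near that value, up to discretization and sampling error. Replacing `score_harmonic` with a learned score evaluated on CG coordinates would follow the same pattern, under the assumption that the learned score is accurate at the noise level where it is queried.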