Molecular dynamics (MD) is a powerful technique for studying microscopic phenomena, but its computational cost has driven significant interest in the development of deep learning-based surrogate models. We introduce generative modeling of molecular trajectories as a paradigm for learning flexible multi-task surrogate models of MD from data. By conditioning on appropriately chosen frames of the trajectory, we show such generative models can be adapted to diverse tasks such as forward simulation, transition path sampling, and trajectory upsampling. By alternatively conditioning on part of the molecular system and inpainting the rest, we also demonstrate the first steps towards dynamics-conditioned molecular design. We validate the full set of these capabilities on tetrapeptide simulations and show that our model can produce reasonable ensembles of protein monomers. Altogether, our work illustrates how generative modeling can unlock value from MD data towards diverse downstream tasks that are not straightforward to address with existing methods or even MD itself. Code is available at https://github.com/bjing2016/mdgen.
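The frame conditioning described above can be pictured as a boolean mask over the frames of a trajectory: some frames are supplied as known, and the model generates the rest. The sketch below is only an illustration under that assumption. The function `frame_mask`, the task names, and the `stride` parameter are hypothetical and are not taken from the mdgen code base.

```python
import numpy as np


def frame_mask(task: str, num_frames: int, stride: int = 1) -> np.ndarray:
    """Return a boolean mask over frames (hypothetical helper).

    True  = frame is given to the model as conditioning.
    False = frame is to be generated.
    """
    mask = np.zeros(num_frames, dtype=bool)
    if task == "forward_simulation":
        # Only the initial frame is known; the rest is rolled out.
        mask[0] = True
    elif task == "transition_path_sampling":
        # Both endpoints are known; the path between them is generated.
        mask[0] = True
        mask[-1] = True
    elif task == "upsampling":
        # Every `stride`-th frame is known; intermediate frames are filled in.
        mask[::stride] = True
    else:
        raise ValueError(f"unknown task: {task}")
    return mask


if __name__ == "__main__":
    for task, kwargs in [
        ("forward_simulation", {}),
        ("transition_path_sampling", {}),
        ("upsampling", {"stride": 3}),
    ]:
        print(task, frame_mask(task, 7, **kwargs).astype(int))
```

Under this view, the three tasks differ only in which frames are marked as known. The molecular design setting mentioned in the abstract would instead mask along the axis of the molecular system (conditioning on part of the system and inpainting the rest), which this sketch does not cover.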