Molecular dynamics (MD) simulation is computationally demanding, particularly for large-scale systems that require long-timescale analysis. Accurately forecasting the outcome of an MD simulation is therefore both an attractive scientific challenge and a problem of substantial practical value. In this work, we develop a data-driven framework, termed ASTEROID (Advanced Spatiotemporal TransformER fOr Inferring Dynamics), that directly predicts atomic coordinates multiple steps ahead, avoiding conventional iterative integration. To this end, ASTEROID reformulates MD trajectories as high-dimensional spatiotemporal sequences and integrates the Spatiotemporal Information (STI) Transformation equation into a Transformer architecture. The core innovation of ASTEROID is its ability to model multiscale spatiotemporal dependencies. For spatial dependencies, a local-global self-attention mechanism captures both short- and long-range interactions. For temporal dependencies, an encoder-decoder structure integrates global context with autoregressive forecasting. We evaluated ASTEROID on several quantum-mechanics-derived molecular datasets. The results indicate that ASTEROID not only achieves higher multi-step prediction accuracy than existing methods on various benchmarks, but also significantly reduces computational cost relative to conventional MD simulation. Moreover, the model supports iterative multi-step forecasting over extended time scales. This work establishes a robust and generalizable data-driven paradigm for accelerating MD simulations.
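The local-global self-attention idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the ASTEROID implementation: the abstract does not specify how the local and global branches are defined or combined. The sketch assumes that "local" means attention restricted to atoms within a distance cutoff, that "global" means unrestricted attention over all atoms, and that the two outputs are averaged with equal weight. The function name `local_global_attention`, the `cutoff` parameter, and the omission of learned query/key/value projections are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def local_global_attention(features, coords, cutoff):
    """Hypothetical local-global self-attention over atoms (illustrative only).

    features: (n_atoms, d) per-atom feature vectors
    coords:   (n_atoms, 3) atomic positions
    cutoff:   non-negative distance defining the local neighborhood (assumed)
    """
    d = features.shape[1]
    # Scaled dot-product scores; learned Q/K/V projections are omitted for brevity.
    scores = features @ features.T / np.sqrt(d)

    # Pairwise distances; each atom is always its own neighbor (distance 0).
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    local_mask = dist <= cutoff

    # Local branch: pairs beyond the cutoff are masked out before the softmax.
    local_scores = np.where(local_mask, scores, -np.inf)
    local_out = softmax(local_scores) @ features

    # Global branch: every atom attends to every other atom.
    global_out = softmax(scores) @ features

    # Equal-weight mixing is an assumption, not taken from the source.
    return 0.5 * (local_out + global_out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 4))   # 5 atoms, 4 features each
    r = rng.normal(size=(5, 3))   # 5 atomic positions
    print(local_global_attention(x, r, cutoff=1.0).shape)
```

With a very large cutoff the local branch coincides with the global one, and with a cutoff of zero each atom attends only to itself locally, which makes the role of the cutoff easy to check on small inputs.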