Molecular dynamics (MD) simulations are a fundamental tool for characterizing molecular behavior at full atomic resolution, but their applicability is severely constrained by computational cost. To address this, a number of deep generative models have recently emerged that learn dynamics at coarsened timesteps for efficient trajectory generation. However, these models either generalize poorly across systems or, because trajectory data cover limited molecular diversity, fail to fully exploit structural information to improve generative fidelity. Here, we present the Pretrained Variational Bridge (PVB), an encoder-decoder model that maps the initial structure into a noised latent space and transports it toward stage-specific targets through augmented bridge matching. This formulation unifies training on single-structure data and paired trajectory data, so that cross-domain structural knowledge is used consistently across training stages. For protein-ligand complexes, we further introduce a reinforcement learning-based optimization via adjoint matching that accelerates progression toward the holo state, supporting efficient post-optimization of docking poses. Experiments on proteins and protein-ligand complexes demonstrate that PVB faithfully reproduces thermodynamic and kinetic observables from MD while delivering stable and efficient generative dynamics.