Understanding transition paths between metastable states in molecular systems is fundamental to material design and drug discovery. However, sampling these paths with molecular dynamics simulations is computationally prohibitive because of the high energy barriers separating the metastable states. Recent machine learning approaches are often restricted to simple systems or rely on collective variables (CVs) that require expensive domain knowledge to construct. In this work, we propose to leverage generative flow networks (GFlowNets) to sample transition paths without relying on CVs. We reformulate the problem as amortized energy-based sampling over molecular trajectories and train a bias potential by minimizing the squared log-ratio between the target distribution and the generator, an objective derived from the flow matching objective of GFlowNets. Our evaluation on three molecular systems (Alanine Dipeptide, Polyproline, and Chignolin) demonstrates that our approach, called TPS-GFN, generates more realistic and diverse transition paths than the previous CV-free machine learning approach.
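To make the squared log-ratio objective concrete, the following is a minimal sketch on a toy discrete set of paths. It is not the TPS-GFN implementation: the function name `squared_log_ratio_loss`, the scalar `log_z` (an assumed estimate of the log normalizing constant), and the made-up rewards are all hypothetical, and the exact form of the loss used in the paper may differ.

```python
import math


def squared_log_ratio_loss(log_rewards, log_gen_probs, log_z):
    """Mean squared log-ratio between an unnormalized target and a generator.

    log_rewards   : log of the unnormalized target density for each sampled path
    log_gen_probs : log-probability the generator assigns to each sampled path
    log_z         : scalar estimate of the log normalizing constant (assumed here)
    """
    # Residual is log(Z * p_gen(x) / R(x)); it is zero when p_gen = R / Z.
    residuals = [log_z + lg - lr for lr, lg in zip(log_rewards, log_gen_probs)]
    return sum(r * r for r in residuals) / len(residuals)


# Toy setup: four "paths" with made-up unnormalized rewards.
rewards = [1.0, 2.0, 3.0, 4.0]
log_rewards = [math.log(r) for r in rewards]
true_log_z = math.log(sum(rewards))

# A generator equal to the normalized target drives the loss to (numerically) zero.
matched = [lr - true_log_z for lr in log_rewards]
loss_matched = squared_log_ratio_loss(log_rewards, matched, true_log_z)

# A uniform generator does not match the target, so the loss is positive.
uniform = [math.log(0.25)] * 4
loss_uniform = squared_log_ratio_loss(log_rewards, uniform, true_log_z)
```

In this sketch the loss vanishes only when the generator's probabilities are proportional to the target, which is the sense in which minimizing it amortizes sampling from an energy-based distribution. In the setting described above, the generator would be the bias-potential-driven dynamics over molecular trajectories rather than a fixed table of probabilities.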