We consider the problem of sampling transition paths between two given metastable states of a molecular system, e.g. the folded and unfolded states of a protein, or the reactants and products of a chemical reaction. Because high energy barriers separate the states, these transition paths are unlikely to be sampled with standard Molecular Dynamics (MD) simulation. Traditional methods augment MD with a bias potential that increases the probability of the transition, and they rely on a dimensionality reduction step based on Collective Variables (CVs). Unfortunately, selecting appropriate CVs requires chemical intuition, and traditional methods are therefore not always applicable to larger systems. Additionally, when incorrect CVs are used, the bias potential might not be minimal and might bias the system along dimensions irrelevant to the transition. By showing a formal relation between the problem of sampling molecular transition paths, the Schrödinger bridge problem, and stochastic optimal control with neural network policies, we propose a machine learning method for sampling these transitions. Unlike previous non-machine-learning approaches, our method, named PIPS, does not depend on CVs. We show that our method successfully generates low-energy transitions for Alanine Dipeptide as well as the larger Polyproline and Chignolin proteins.