We investigate learning the eigenfunctions of evolution operators for time-reversal invariant stochastic processes, a prime example being the Langevin equation used in molecular dynamics. Many physical or chemical processes described by this equation involve transitions between metastable states separated by potential barriers so high that they are rarely crossed on simulation timescales. To overcome this bottleneck, data are collected via biased simulations that explore the state space more rapidly. We propose a framework for learning from biased simulations rooted in the infinitesimal generator of the process and the associated resolvent operator. We contrast our approach with more common ones based on the transfer operator, showing that it can provably learn the spectral properties of the unbiased system from biased data. In experiments, we highlight the advantages of our method over transfer operator approaches and recent developments based on generator learning, demonstrating its effectiveness in estimating eigenfunctions and eigenvalues. Importantly, we show that even with datasets containing only a few relevant transitions due to sub-optimal biasing, our approach recovers relevant information about the transition mechanism.
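The link between the infinitesimal generator and its resolvent can be illustrated on a toy problem. The following is a minimal sketch, not the learning method described above: it assumes a one-dimensional overdamped Langevin process in a double-well potential, discretized on a grid with a simple reversible nearest-neighbour scheme. The function `generator_matrix`, the potential, and the values of `beta` and `eta` are all illustrative assumptions. The sketch checks numerically that the resolvent $(\eta I - L)^{-1}$ shares the eigenfunctions of the generator $L$, with each eigenvalue $\lambda$ mapped to $1/(\eta - \lambda)$.

```python
import numpy as np

def generator_matrix(V, h, beta):
    """Reversible nearest-neighbour discretization of L = -V'(x) d/dx + (1/beta) d^2/dx^2."""
    n = len(V)
    L = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                # Jump rate i -> j; satisfies detailed balance w.r.t. exp(-beta * V).
                L[i, j] = np.exp(-0.5 * beta * (V[j] - V[i])) / (beta * h**2)
        L[i, i] = -L[i].sum()  # rows sum to zero, so constants have eigenvalue 0
    return L

# Toy setup (all values are illustrative assumptions).
x = np.linspace(-1.8, 1.8, 120)
h = x[1] - x[0]
beta = 2.0
V = (x**2 - 1.0) ** 2  # double well with minima at x = -1 and x = +1
L = generator_matrix(V, h, beta)

# Reversibility: conjugating L with the square root of the Boltzmann weights
# yields a symmetric matrix, so the spectrum is real and eigh can be used.
pi = np.exp(-beta * V)
pi /= pi.sum()
d = np.sqrt(pi)
S = d[:, None] * L / d[None, :]
S = 0.5 * (S + S.T)  # remove round-off asymmetry
lam, U = np.linalg.eigh(S)
lam, U = lam[::-1], U[:, ::-1]  # sort descending, so the zero eigenvalue comes first
F = U / d[:, None]  # columns are eigenfunctions of L on the grid

# Apply the resolvent (eta * I - L)^{-1} to the three leading eigenfunctions.
eta = 1.0
RF = np.linalg.solve(eta * np.eye(len(x)) - L, F[:, :3])
resolvent_eigs = 1.0 / (eta - lam[:3])

print("leading generator eigenvalues:", lam[:3])
print("matching resolvent eigenvalues:", resolvent_eigs)
```

In this toy setting the second eigenfunction, `F[:, 1]`, changes sign between the two wells, which is why eigenfunctions of this kind are useful for describing transitions between metastable states. Because the resolvent shares these eigenfunctions, estimating it is one route to the same spectral information.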