Extracting consistent statistics between relevant free-energy minima of a molecular system is essential for physics, chemistry and biology. Molecular dynamics (MD) simulations can aid in this task but are computationally expensive, especially for systems that require quantum accuracy. To overcome this challenge, we develop an approach that combines enhanced sampling with deep generative models and active learning of a machine learning potential (MLP). We introduce an adaptive Markov chain Monte Carlo framework that enables the training of one normalizing flow (NF) and one MLP per state. We simulate several Markov chains in parallel until they converge, sampling the Boltzmann distribution while making efficient use of energy evaluations. At each iteration, we compute the energies of a subset of the NF-generated configurations using density functional theory (DFT), predict the energies of the remaining configurations with the MLP, and actively train the MLP on the DFT-computed energies. Leveraging the trained NF and MLP models, we can compute thermodynamic observables such as free-energy differences or optical spectra. We apply this method to study the isomerization of an ultrasmall silver nanocluster, a member of a class of systems with diverse applications in medicine and catalysis.
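The per-iteration labeling scheme described in the abstract can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: a one-dimensional double well stands in for the DFT energy, a polynomial least-squares fit stands in for the MLP, a fixed Gaussian stands in for the NF proposal, the chains are updated with a plain independence Metropolis-Hastings step, and the surrogate is refit before it predicts. All function and class names are hypothetical.

```python
import numpy as np

def reference_energy(x):
    """Toy stand-in for an expensive DFT call: double well, in units of kT."""
    return (x**2 - 1.0) ** 2

class PolySurrogate:
    """Toy stand-in for the MLP: polynomial least-squares fit to labeled data."""
    def __init__(self, degree=4):
        self.degree = degree
        self.coeffs = np.zeros(degree + 1)
        self.x_train = np.empty(0)
        self.e_train = np.empty(0)

    def update(self, x, e):
        # Accumulate reference-labeled data and refit once enough points exist.
        self.x_train = np.concatenate([self.x_train, x])
        self.e_train = np.concatenate([self.e_train, e])
        if self.x_train.size > self.degree:
            self.coeffs = np.polyfit(self.x_train, self.e_train, self.degree)

    def predict(self, x):
        return np.polyval(self.coeffs, x)

def label_batch(x, surrogate, n_ref, rng):
    """Label n_ref configurations with the reference energy, retrain the
    surrogate on them, then label the remaining ones with the surrogate."""
    is_ref = np.zeros(x.size, dtype=bool)
    is_ref[rng.choice(x.size, size=n_ref, replace=False)] = True
    energies = np.empty(x.size)
    energies[is_ref] = reference_energy(x[is_ref])
    surrogate.update(x[is_ref], energies[is_ref])  # assumed order: train, then predict
    energies[~is_ref] = surrogate.predict(x[~is_ref])
    return energies, is_ref

def log_q(x, scale):
    """Log-density of the Gaussian proposal (stand-in for the NF density)."""
    return -0.5 * (x / scale) ** 2 - np.log(scale * np.sqrt(2.0 * np.pi))

def mh_step(x_cur, e_cur, x_prop, e_prop, scale, rng):
    """Independence Metropolis-Hastings accept/reject, one proposal per chain."""
    log_alpha = -(e_prop - e_cur) + log_q(x_cur, scale) - log_q(x_prop, scale)
    accept = np.log(rng.random(x_cur.size)) < log_alpha
    return np.where(accept, x_prop, x_cur), np.where(accept, e_prop, e_cur), accept

rng = np.random.default_rng(0)
scale, n_chains, n_ref, n_iter = 1.5, 32, 8, 50
surrogate = PolySurrogate()
x = rng.normal(0.0, scale, n_chains)   # initial state of each parallel chain
e = reference_energy(x)                # initial energies from the reference

for _ in range(n_iter):
    x_prop = rng.normal(0.0, scale, n_chains)  # one proposal per chain
    e_prop, is_ref = label_batch(x_prop, surrogate, n_ref, rng)
    x, e, accepted = mh_step(x, e, x_prop, e_prop, scale, rng)

print("reference calls in the loop:", surrogate.x_train.size, "of", n_iter * n_chains)
```

In this toy setting only `n_ref` of the `n_chains` proposals per iteration trigger a reference evaluation, which is the source of the savings the abstract refers to. Because the toy energy is itself a quartic polynomial, the surrogate is essentially exact after the first refit; a real MLP would carry a model error that the active-learning loop is meant to reduce.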