Flow-based generative models have been employed for sampling the Boltzmann distribution, but their application to high-dimensional systems is hindered by the significant computational cost of obtaining the Jacobian of the flow. To overcome this challenge, we introduce the flow perturbation method, which incorporates optimized stochastic perturbations into the flow. By reweighting trajectories generated by the perturbed flow, our method achieves unbiased sampling of the Boltzmann distribution with a speedup of orders of magnitude over both brute-force Jacobian calculations and the Hutchinson estimator. Notably, it accurately sampled the Chignolin protein with all atomic Cartesian coordinates explicitly represented, which, to the best of our knowledge, is the largest molecule ever Boltzmann-sampled in such detail using generative models.
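For readers unfamiliar with the two baselines named above, the following is a minimal sketch of what they compute. It is not the flow perturbation method itself. It assumes the quantity of interest is the trace of a Jacobian (as in continuous flows, where the log-density change is the time integral of that trace) and that the Jacobian is available only through matrix-vector products. The matrix `J`, the helper names `exact_trace` and `hutchinson_trace`, and the dimension and probe counts are all hypothetical choices made for illustration.

```python
import numpy as np

def exact_trace(matvec, dim):
    """Brute-force trace: one Jacobian-vector product per basis vector."""
    total = 0.0
    for i in range(dim):
        e = np.zeros(dim)
        e[i] = 1.0
        total += matvec(e)[i]  # picks out the i-th diagonal entry
    return total

def hutchinson_trace(matvec, dim, num_probes, rng):
    """Stochastic trace estimate: mean of v^T (J v) over Rademacher probes v."""
    estimates = np.empty(num_probes)
    for k in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=dim)  # entries are +1 or -1
        estimates[k] = v @ matvec(v)
    return estimates.mean()

# Hypothetical stand-in for a flow Jacobian (assumption, not from the paper).
rng = np.random.default_rng(0)
dim = 50
J = rng.normal(size=(dim, dim)) / np.sqrt(dim)

def matvec(v):
    return J @ v

exact = exact_trace(matvec, dim)
approx = hutchinson_trace(matvec, dim, num_probes=2000, rng=rng)
print(f"exact trace: {exact:.4f}, Hutchinson estimate: {approx:.4f}")
```

The brute-force route needs `dim` Jacobian-vector products per evaluation, which is what becomes prohibitive in high dimension. The Hutchinson route needs only `num_probes` products but returns a noisy estimate, so its cost is set by how many probes are needed to control the variance.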