Sampling equilibrium molecular configurations from the Boltzmann distribution is a longstanding challenge. Boltzmann Generators (BGs) address this by combining exact-likelihood generative models with importance sampling, but their practical scalability is limited. Meanwhile, coarse-grained surrogates enable the modeling of larger systems by reducing effective dimensionality, yet often lack the reweighting process required to ensure asymptotically correct statistics. In this work, we propose Coarse-Grained Boltzmann Generators (CG-BGs), a principled framework that unifies scalable reduced-order modeling with the exactness of importance sampling. CG-BGs act in a coarse-grained coordinate space, using a learned potential of mean force (PMF) to reweight samples generated by a flow-based model. Crucially, we show that this PMF can be efficiently learned from rapidly converged data via force matching. Our results demonstrate that CG-BGs faithfully capture complex interactions mediated by explicit solvent within highly reduced representations, establishing a scalable pathway for the unbiased sampling of larger molecular systems.
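The reweighting step described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the paper: a one-dimensional coarse-grained coordinate `z`, a fixed Gaussian proposal standing in for the flow-based model (any model with an exact likelihood plays the same role), and a toy double-well function `pmf` standing in for the learned potential of mean force. Samples from the proposal receive self-normalized importance weights proportional to exp(-beta * U(z)) / q(z), and the reweighted average is compared with a grid-based reference.

```python
import numpy as np

SIGMA = 1.5  # width of the hypothetical Gaussian proposal


def log_proposal(z, sigma=SIGMA):
    """Exact log-density of the proposal q(z); a stand-in for a flow's log-likelihood."""
    return -0.5 * (z / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))


def pmf(z):
    """Toy double-well PMF in units of kT (assumed, not a learned model)."""
    return (z ** 2 - 1.0) ** 2


def importance_weights(z, log_q, beta=1.0):
    """Self-normalized weights w_i proportional to exp(-beta * U(z_i)) / q(z_i)."""
    log_w = -beta * pmf(z) - log_q
    log_w = log_w - log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()


def reference_mean_z2(beta=1.0):
    """Reference <z^2> under exp(-beta * U) via a uniform grid (spacing cancels)."""
    grid = np.linspace(-4.0, 4.0, 20001)
    p = np.exp(-beta * pmf(grid))
    return np.sum(p * grid ** 2) / np.sum(p)


rng = np.random.default_rng(0)
z = rng.normal(0.0, SIGMA, size=200_000)  # draw samples from the proposal
w = importance_weights(z, log_proposal(z))

mean_z2 = np.sum(w * z ** 2)   # reweighted estimate of <z^2>
ess = 1.0 / np.sum(w ** 2)     # Kish effective sample size
ref = reference_mean_z2()

print(f"reweighted <z^2> = {mean_z2:.4f}, grid reference = {ref:.4f}")
print(f"effective sample size = {ess:.0f} of {z.size}")
```

The effective sample size is a common diagnostic for how well the proposal overlaps the target: a value far below the number of samples indicates that a few weights dominate the estimate.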