Current state-of-the-art generative models map noise to data distributions by matching flows or scores. A key limitation of these models is their inability to readily integrate available partial observations and additional priors. In contrast, energy-based models (EBMs) address this by incorporating corresponding scalar energy terms. Here, we propose Energy Matching, a framework that endows flow-based approaches with the flexibility of EBMs. Far from the data manifold, samples move from noise to data along irrotational, optimal transport paths. As they approach the data manifold, an entropic energy term guides the system into a Boltzmann equilibrium distribution, explicitly capturing the underlying likelihood structure of the data. We parameterize these dynamics with a single time-independent scalar field, which serves as both a powerful generator and a flexible prior for effective regularization of inverse problems. The present method substantially outperforms existing EBMs on CIFAR-10 and ImageNet generation in terms of fidelity, while retaining the simulation-free training of transport-based approaches away from the data manifold. Furthermore, we leverage the flexibility of the method to introduce an interaction energy that supports the exploration of diverse modes, which we demonstrate in a controlled protein generation setting. This approach learns a scalar potential energy without time conditioning, auxiliary generators, or additional networks, marking a significant departure from recent EBM methods. We believe this simplified yet rigorous formulation significantly advances the capabilities of EBMs and paves the way for their wider adoption in generative modeling across diverse domains.
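The two-regime dynamics described in the abstract (deterministic transport far from the data, then relaxation toward a Boltzmann distribution near it) can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and not the authors' implementation: the quadratic potential standing in for a learned scalar field, the names `grad_potential`, `temperature`, and `sample`, and the parameters `t_switch` and `eps_max` that define a hypothetical temperature schedule. The sketch integrates a Langevin-type update with a time-independent potential, using zero temperature at first and a small positive temperature later.

```python
import numpy as np


def grad_potential(x, mu):
    # Gradient of the toy potential V(x) = 0.5 * ||x - mu||^2.
    # Hypothetical stand-in for the gradient of a learned scalar field.
    return x - mu


def temperature(t, t_switch=1.0, eps_max=0.05):
    # Assumed schedule: zero temperature before t_switch (pure gradient flow),
    # then a linear ramp that saturates at eps_max.
    if t < t_switch:
        return 0.0
    return min(eps_max, eps_max * (t - t_switch) / t_switch)


def sample(n=2000, dim=2, dt=0.01, t_end=6.0, seed=0):
    rng = np.random.default_rng(seed)
    mu = np.full(dim, 3.0)             # location of the toy "data" mode
    x = rng.standard_normal((n, dim))  # start from Gaussian noise
    steps = int(round(t_end / dt))
    for k in range(steps):
        eps = temperature(k * dt)
        noise = rng.standard_normal(x.shape)
        # Euler-Maruyama step: drift down the potential plus thermal noise.
        x = x - dt * grad_potential(x, mu) + np.sqrt(2.0 * eps * dt) * noise
    return x, mu


if __name__ == "__main__":
    x, mu = sample()
    print("sample mean:", x.mean(axis=0))
    print("sample variance:", x.var(axis=0))
```

For this quadratic potential, the stationary density proportional to exp(-V/eps) is a Gaussian centered at `mu` with per-coordinate variance close to `eps_max`, so the printed statistics give a quick sanity check that the samples have relaxed to the intended equilibrium.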