Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling

Current state-of-the-art generative models map noise to data distributions by matching flows or scores. A key limitation of these models is their inability to readily integrate available partial observations and additional priors. In contrast, energy-based models (EBMs) address this by incorporating corresponding scalar energy terms. Here, we propose Energy Matching, a framework that endows flow-based approaches with the flexibility of EBMs. Far from the data manifold, samples move from noise to data along irrotational, optimal transport paths. As they approach the data manifold, an entropic energy term guides the system into a Boltzmann equilibrium distribution, explicitly capturing the underlying likelihood structure of the data. We parameterize these dynamics with a single time-independent scalar field, which serves as both a powerful generator and a flexible prior for effective regularization of inverse problems. The present method substantially outperforms existing EBMs on CIFAR-10 and ImageNet generation in terms of fidelity, while retaining the simulation-free training of transport-based approaches away from the data manifold. Furthermore, we leverage the flexibility of the method to introduce an interaction energy that supports the exploration of diverse modes, which we demonstrate in a controlled protein generation setting. This approach learns a scalar potential energy without time conditioning, auxiliary generators, or additional networks, marking a significant departure from recent EBM methods. We believe this simplified yet rigorous formulation significantly advances EBMs' capabilities and paves the way for their wider adoption in generative modeling across diverse domains.
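To make the two sampling regimes described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy quadratic potential in place of a learned network, and a hypothetical step-function temperature schedule (the names `MU`, `potential`, `grad_potential`, `temperature`, `t_switch`, and `eps_max` are all illustrative). With zero temperature the update is a deterministic gradient flow on the scalar field; with positive temperature it becomes a Langevin step whose stationary distribution is the Boltzmann density proportional to exp(-V(x)/eps).

```python
import numpy as np

# Illustrative target location; stands in for the "data manifold" (assumption).
MU = np.array([3.0, -2.0])

def potential(x):
    """Toy time-independent scalar energy V(x) = 0.5 * ||x - MU||^2."""
    return 0.5 * np.sum((x - MU) ** 2, axis=-1)

def grad_potential(x):
    """Analytic gradient of the toy potential; a learned V would use autodiff."""
    return x - MU

def temperature(t, t_switch=1.0, eps_max=0.05):
    """Hypothetical schedule: no noise early (transport-like phase),
    constant temperature eps_max later (equilibration phase)."""
    return 0.0 if t < t_switch else eps_max

def sample(n, steps=400, dt=0.01, seed=0):
    """Euler-Maruyama integration of dx = -grad V(x) dt + sqrt(2 eps(t)) dW."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))  # start from Gaussian noise
    for k in range(steps):
        eps = temperature(k * dt)
        noise = rng.standard_normal(x.shape)
        x = x - dt * grad_potential(x) + np.sqrt(2.0 * eps * dt) * noise
    return x

if __name__ == "__main__":
    xs = sample(2000)
    print("sample mean:", xs.mean(axis=0))
    print("sample variance:", xs.var(axis=0))
```

For this quadratic potential the Boltzmann distribution at temperature eps is a Gaussian centered at `MU` with per-coordinate variance eps, so the printed mean and variance give a quick sanity check. The same scalar field drives both phases, which is the property the abstract emphasizes; only the noise level changes.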

