Energy-based models (EBMs) have been known in the machine learning community for decades. Since the seminal works on EBMs in the 2000s, many efficient methods have been developed that solve the generative modelling problem by means of energy potentials (unnormalized likelihood functions). In contrast, the realm of Optimal Transport (OT) and, in particular, neural OT solvers is much less explored and limited to a few recent works (excluding WGAN-based approaches, which utilize OT as a loss function and do not model OT maps themselves). In our work, we bridge the gap between EBMs and entropy-regularized OT. We present a novel methodology that allows utilizing the recent developments and technical improvements of the former to enrich the latter. From the theoretical perspective, we prove generalization bounds for our technique. In practice, we validate its applicability in toy 2D and image domains. To showcase the scalability, we empower our method with a pre-trained StyleGAN and apply it to high-resolution AFHQ $512\times 512$ unpaired image-to-image (I2I) translation. For simplicity, we choose simple short- and long-run EBMs as the backbone of our Energy-guided Entropic OT approach, leaving the application of more sophisticated EBMs for future research. Our code is available at: https://github.com/PetrMokrov/Energy-guided-Entropic-OT