This paper studies the fundamental problem of learning energy-based models (EBMs). An EBM can be learned by maximum likelihood estimation (MLE), which typically requires Markov chain Monte Carlo (MCMC) sampling, such as Langevin dynamics. However, noise-initialized Langevin dynamics can be hard to mix and challenging to run in practice. This motivates joint training with a generator model, where the generator serves as a complementary model that bypasses MCMC sampling. However, such a method can be less accurate than MCMC and can result in biased EBM learning. While the generator can also serve as an initializer for better MCMC sampling, its learning can be biased, since it only matches the EBM and has no access to the empirical training examples. Such biased generator learning may limit the potential of EBM learning. To address this issue, we present a joint learning framework that interweaves maximum likelihood learning of the EBM and of the complementary generator model. In particular, the generator is learned by MLE to match both the EBM and the empirical data distribution, making it a more informative initializer for MCMC sampling of the EBM. Learning the generator from observed examples typically requires inference of the generator posterior. To ensure accurate and efficient inference, we adopt MCMC posterior sampling and introduce a complementary inference model to initialize this latent MCMC sampling. We show that the three separate models can be seamlessly integrated into our joint framework through two MCMC teaching processes (dual-MCMC teaching), enabling effective and efficient EBM learning.
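To illustrate why the initializer matters for short-run Langevin sampling, the following is a minimal sketch and not the paper's implementation. It assumes a toy one-dimensional quadratic energy with its mode at `mu = 2.0`, and it stands in for a learned generator with a fixed Gaussian centered near that mode. All names (`energy_grad`, `langevin_sample`, `mean_after`) and all parameter values are hypothetical choices made for this example.

```python
import random


def energy_grad(x, mu=2.0, sigma=0.5):
    """Gradient of the assumed toy energy E(x) = (x - mu)^2 / (2 * sigma^2)."""
    return (x - mu) / (sigma ** 2)


def langevin_sample(x0, n_steps, step_size, rng, grad_fn=energy_grad):
    """Run unadjusted Langevin dynamics from x0 for n_steps.

    Update rule: x <- x - (s^2 / 2) * dE/dx + s * eps, with eps ~ N(0, 1).
    """
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step_size ** 2 * grad_fn(x) + step_size * noise
    return x


def mean_after(init_fn, n_chains=2000, n_steps=20, step_size=0.1, seed=0):
    """Average final state over many short chains started from init_fn."""
    rng = random.Random(seed)
    finals = [
        langevin_sample(init_fn(rng), n_steps, step_size, rng)
        for _ in range(n_chains)
    ]
    return sum(finals) / len(finals)


def noise_init(rng):
    """Noise initializer: standard normal, unaware of the target."""
    return rng.gauss(0.0, 1.0)


def informed_init(rng):
    """Stand-in for a generator that already roughly matches the target."""
    return rng.gauss(1.8, 0.6)


if __name__ == "__main__":
    print("noise-initialized mean:   ", mean_after(noise_init))
    print("informed-initialized mean:", mean_after(informed_init))
```

With the same small number of Langevin steps, chains started from the informed initializer should end closer to the mode of the toy energy than chains started from noise. This is the intuition behind using a well-trained generator as the MCMC initializer; the actual framework learns the generator, the EBM, and the inference model jointly.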