Efficient Training of Energy-Based Models Using Jarzynski Equality

Energy-based models (EBMs) are generative models inspired by statistical physics with a wide range of applications in unsupervised learning. Their performance is best measured by the cross-entropy (CE) of the model distribution relative to the data distribution. Using the CE as the training objective is, however, challenging because computing its gradient with respect to the model parameters requires sampling the model distribution. Here we show how results from nonequilibrium thermodynamics based on the Jarzynski equality, together with tools from sequential Monte Carlo sampling, can be used to perform this computation efficiently and avoid the uncontrolled approximations made by the standard contrastive divergence algorithm. Specifically, we introduce a modification of the unadjusted Langevin algorithm (ULA) in which each walker acquires a weight that enables the estimation of the gradient of the cross-entropy at any step during gradient descent (GD), thereby bypassing the sampling biases induced by the slow mixing of ULA. We illustrate these results with numerical experiments on Gaussian mixture distributions as well as the MNIST dataset. We show that the proposed approach outperforms methods based on the contrastive divergence algorithm in all the situations considered.
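To make the idea of weighted walkers concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a one-dimensional energy U_theta(x) = (x - theta)^2 / 2, for which the CE gradient is E_data[dU/dtheta] - E_model[dU/dtheta]. The log-weight update is the generic sequential Monte Carlo incremental weight obtained by pairing the ULA forward kernel at the old parameter with a ULA backward kernel at the new parameter; the abstract does not state the exact update, so this choice is an assumption. All names (`train`, `alpha`, `n_walkers`, the step sizes, the resampling threshold) are hypothetical.

```python
import numpy as np

def energy(x, theta):
    # Toy energy (assumption): U_theta(x) = (x - theta)^2 / 2
    return 0.5 * (x - theta) ** 2

def grad_x_energy(x, theta):
    return x - theta

def grad_theta_energy(x, theta):
    return -(x - theta)

def alpha(x, y, theta, h):
    # Negative log of the unnormalized density exp(-U(x)) times the ULA
    # kernel from x to y, with the symmetric |y - x|^2 / (4h) term dropped
    # because it cancels in the weight update below.
    g = grad_x_energy(x, theta)
    return energy(x, theta) + 0.5 * (y - x) * g + 0.25 * h * g ** 2

def train(data, n_walkers=2000, n_steps=500, h=0.05, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta = 0.0
    # Walkers start as exact samples of the initial model, N(theta, 1).
    x = rng.normal(theta, 1.0, size=n_walkers)
    log_w = np.zeros(n_walkers)
    for _ in range(n_steps):
        # Self-normalized weights (shifted by the max for stability).
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Weighted estimate of the model expectation in the CE gradient.
        model_term = np.sum(w * grad_theta_energy(x, theta))
        data_term = np.mean(grad_theta_energy(data, theta))
        new_theta = theta - lr * (data_term - model_term)
        # One ULA step using the energy at the current parameter.
        noise = rng.normal(size=n_walkers)
        x_new = x - h * grad_x_energy(x, theta) + np.sqrt(2.0 * h) * noise
        # Weight update correcting for the parameter change and ULA bias.
        log_w += alpha(x, x_new, theta, h) - alpha(x_new, x, new_theta, h)
        x, theta = x_new, new_theta
        # Multinomial resampling when the effective sample size is low.
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        if 1.0 / np.sum(w ** 2) < 0.5 * n_walkers:
            idx = rng.choice(n_walkers, size=n_walkers, p=w)
            x = x[idx]
            log_w = np.zeros(n_walkers)
    return theta

# Usage: synthetic data drawn from N(2, 1); theta_hat should approach the data mean.
data = np.random.default_rng(1).normal(2.0, 1.0, size=1000)
theta_hat = train(data)
```

In this sketch the walkers are never assumed to have equilibrated: the weights account for the lag between the walkers and the moving model, which is the role the abstract attributes to them. Contrastive divergence, by comparison, restarts short chains from data and uses them unweighted.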

