The variational lower bound (a.k.a. ELBO or free energy) is the central objective for many established as well as many novel algorithms for unsupervised learning. Such algorithms usually increase the bound until parameters have converged to values close to a stationary point of the learning dynamics. Here we show that (for a very large class of generative models) the variational lower bound is, at all stationary points of learning, equal to a sum of entropies. Concretely, for standard generative models with one set of latents and one set of observed variables, the sum consists of three entropies: (A) the (average) entropy of the variational distributions, (B) the negative entropy of the model's prior distribution, and (C) the (expected) negative entropy of the observable distribution. The result applies under realistic conditions: for finite numbers of data points, at any stationary point (including saddle points), and for any family of (well-behaved) variational distributions. The class of generative models for which we show equality to entropy sums contains many standard as well as novel models, including standard (Gaussian) variational autoencoders. The prerequisites we use to show equality to entropy sums are relatively mild: the distributions defining a given generative model have to be of the exponential family, and the model has to satisfy a parameterization criterion (which is usually fulfilled). Proving equality of the ELBO to entropy sums at stationary points (under the stated conditions) is the main contribution of this work.
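The three-term decomposition can be checked numerically on a small member of the stated model class, for instance probabilistic PCA with a single latent (Gaussian prior, Gaussian observable, both exponential family). The sketch below is illustrative, not the paper's derivation: it assumes the closed-form maximum-likelihood solution of probabilistic PCA as the stationary point and exact Gaussian posteriors as the variational distributions; all variable names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 500, 5
X = rng.standard_normal((N, D)) @ rng.standard_normal((D, D))
X -= X.mean(axis=0)  # the model assumes zero-mean observables

# Maximum-likelihood stationary point of probabilistic PCA with one latent:
# W = u1 * sqrt(lambda1 - sigma^2), sigma^2 = mean of discarded eigenvalues.
C = X.T @ X / N
lam, U = np.linalg.eigh(C)          # eigh returns ascending order
lam, U = lam[::-1], U[:, ::-1]
sig2 = lam[1:].mean()
W = U[:, 0] * np.sqrt(lam[0] - sig2)

# Exact Gaussian posteriors q_n(z) = N(m_n, v) as variational distributions
v = sig2 / (sig2 + W @ W)
m = X @ W / (sig2 + W @ W)

# ELBO averaged over data points, computed term by term
rec = np.sum(X**2, axis=1) - 2 * m * (X @ W) + (m**2 + v) * (W @ W)
elbo = np.mean(
    -0.5 * D * np.log(2 * np.pi * sig2) - rec / (2 * sig2)  # E_q[log p(x|z)]
    - 0.5 * np.log(2 * np.pi) - 0.5 * (m**2 + v)            # E_q[log p(z)]
    + 0.5 * np.log(2 * np.pi * np.e * v)                    # H(q_n)
)

# Entropy sum: (A) average posterior entropy, minus (B) prior entropy,
# minus (C) entropy of the observable (noise) distribution
H_q = 0.5 * np.log(2 * np.pi * np.e * v)
H_prior = 0.5 * np.log(2 * np.pi * np.e)
H_noise = 0.5 * D * np.log(2 * np.pi * np.e * sig2)
entropy_sum = H_q - H_prior - H_noise

print(elbo, entropy_sum)  # the two quantities agree at the stationary point
```

At a non-stationary parameter setting (e.g. a random W) the two quantities differ; equality emerges only at stationary points, which is what the abstract claims.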