Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. Here, we derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference; (B) unlike previous non-trivial approximations, the novel objective is fully analytical; and (C) the objective allows for a novel, principled form of annealing. The objective is derived by first showing that the standard ELBO objective converges to a sum of entropies, which matches similar recent results for generative models with Gaussian priors. The conditions under which the ELBO becomes equal to a sum of entropies are then shown to have analytical solutions, which leads to the fully analytical objective. Numerical experiments demonstrate the feasibility of learning with such entropy-based ELBOs. We investigate different posterior approximations, including Gaussians with correlated latents and deep amortized approximations. Furthermore, we numerically investigate entropy-based annealing, which results in improved learning. Our main contributions are theoretical, however, and they are twofold: (1) for non-trivial posterior approximations, we provide the (to the knowledge of the authors) first analytical ELBO objective for standard probabilistic sparse coding; and (2) we provide the first demonstration of how a recently shown convergence of the ELBO to entropy sums can be used for learning.
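For concreteness, the generative model named in the first sentence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a unit-scale Laplace prior, isotropic Gaussian noise with standard deviation `sigma`, and the hypothetical names `sample_sparse_coding`, `laplace_entropy`, and `gaussian_entropy`. It does not reproduce the objective derived in the paper; it only shows that the entropies of the two model distributions are available in closed form.

```python
import math
import random


def sample_sparse_coding(W, sigma, rng):
    """Draw one (z, x) pair from the assumed model.

    W is a list of D rows, each of length H (the linear mapping).
    Assumption: unit-scale Laplace prior and isotropic Gaussian noise.
    """
    H = len(W[0])
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1) distributed.
    z = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(H)]
    # x_d = sum_h W[d][h] * z[h] + Gaussian noise with std sigma.
    x = [
        sum(w_dh * z_h for w_dh, z_h in zip(row, z)) + rng.gauss(0.0, sigma)
        for row in W
    ]
    return z, x


def laplace_entropy(H, scale=1.0):
    """Differential entropy (nats) of H independent Laplace(0, scale) latents."""
    return H * (1.0 + math.log(2.0 * scale))


def gaussian_entropy(D, sigma):
    """Differential entropy (nats) of a D-dim Gaussian with covariance sigma^2 * I."""
    return 0.5 * D * math.log(2.0 * math.pi * math.e * sigma ** 2)


rng = random.Random(0)
# Hypothetical sizes: D = 8 observables, H = 4 latents.
W = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
z, x = sample_sparse_coding(W, sigma=0.1, rng=rng)
```

Both entropy functions depend only on the model parameters (prior scale and noise level) and not on the data, which is what makes closed-form entropy terms attractive as building blocks of a learning objective.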