Generic Unsupervised Optimization for a Latent Variable Model With Exponential Family Observables

Latent variable models (LVMs) represent observed variables by parameterized functions of latent variables. Prominent examples of LVMs for unsupervised learning are probabilistic PCA and probabilistic sparse coding (SC), both of which assume a weighted linear summation of the latents to determine the mean of a Gaussian distribution for the observables. In many cases, however, observables do not follow a Gaussian distribution. For unsupervised learning, LVMs that assume specific non-Gaussian observables have therefore been considered. Even for specific choices of distribution, parameter optimization is challenging, and only a few previous contributions have considered LVMs with more generally defined observable distributions. Here, we consider LVMs that are defined for a range of different distributions, i.e., observables can follow any (regular) distribution of the exponential family. The novel class of LVMs presented is defined for binary latents, and it uses maximization in place of summation to link the latents to the observables. To derive an optimization procedure, we follow an expectation maximization (EM) approach for maximum likelihood parameter estimation. We show that a set of very concise parameter update equations can be derived which feature the same functional form for all exponential family distributions. The derived generic optimization can consequently be applied to different types of metric data as well as to different types of discrete data. Furthermore, the derived optimization equations can be combined with a recently suggested variational acceleration which is likewise generically applicable to the LVMs considered here. The combination thus maintains the generic and direct applicability of the derived optimization procedure but, crucially, enables efficient scalability. We numerically verify our analytical results and discuss some potential applications such as learning of variance structure, noise type estimation, and denoising.
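The max-based link between binary latents and observables can be illustrated with a minimal generative sketch. Only "binary latents" and "maximization in place of summation" come from the abstract; everything else is an assumption made for illustration: the Bernoulli priors `pi`, Bernoulli observables as the chosen exponential family member, weights `W` read directly as Bernoulli means, a mean of zero when no latent is active, and the hypothetical function names `max_link` and `sample`. It is not the paper's model definition or its EM procedure.

```python
import random


def max_link(s, W):
    """Per-observable parameter: the largest weight among active latents.

    s is a list of H binary latents, W a D x H list of weights.
    Assumption: the parameter is 0.0 when no latent is active.
    """
    return [
        max((w for w, s_h in zip(row, s) if s_h), default=0.0)
        for row in W
    ]


def sample(W, pi, rng):
    """Draw one (latents, observables) pair from the toy model."""
    # Binary latents with assumed independent Bernoulli priors.
    s = [int(rng.random() < p) for p in pi]
    # Maximization, not summation, combines the active latents.
    mean = max_link(s, W)
    # Assumed Bernoulli observables with the linked means.
    y = [int(rng.random() < m) for m in mean]
    return s, y


# Hypothetical parameters: D = 3 observables, H = 2 latents.
W = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]
pi = [0.5, 0.5]
rng = random.Random(0)
s, y = sample(W, pi, rng)
```

With both latents active, `max_link([1, 1], W)` picks the larger weight in each row, whereas a linear model would add the two contributions. Other observable types would change only the final sampling step in this sketch.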
