We consider the problem of computing a Gaussian approximation to the posterior distribution of a parameter, given a large number N of observations and a Gaussian prior, when the dimension d of the parameter is also large. To address this problem, we build on a recently introduced recursive algorithm for variational Gaussian approximation of the posterior, called recursive variational Gaussian approximation (RVGA), which is a single-pass algorithm that requires no parameter tuning. We propose a novel version of RVGA whose computational cost scales linearly in the dimension d (as well as in the number of observations N), and which requires only storage linear in d. This is made possible by a novel recursive expectation-maximization (EM) algorithm for factor analysis, introduced herein, which approximates at each step the covariance matrix of the Gaussian distribution conveying the uncertainty in the parameter. The approach is successfully illustrated on high-dimensional least-squares and logistic regression problems, and is generalized to a large class of nonlinear models.
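To make the linear-storage claim concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the standard factor analysis form, a low-rank term plus a diagonal term, W Wᵀ + diag(psi), with W of size d × p and p much smaller than d. The names `fa_matvec`, `W`, `psi`, and the latent dimension `p` are illustrative assumptions; the abstract does not specify them, and the recursive EM updates are not reproduced here.

```python
import numpy as np


def fa_matvec(W, psi, v):
    """Multiply (W @ W.T + diag(psi)) by v without forming the d x d matrix.

    Assumed factor-analysis structure: W is (d, p), psi is (d,), v is (d,).
    Cost is O(d * p) in both time and memory.
    """
    return W @ (W.T @ v) + psi * v


rng = np.random.default_rng(0)
d, p = 2000, 5  # hypothetical sizes: parameter dimension and latent dimension

W = rng.standard_normal((d, p))       # low-rank factor
psi = rng.uniform(0.5, 1.5, size=d)   # positive diagonal term
v = rng.standard_normal(d)

fast = fa_matvec(W, psi, v)

# Numbers stored for the structured form versus a dense d x d matrix.
stored_floats = W.size + psi.size     # d * (p + 1), linear in d
dense_floats = d * d                  # quadratic in d
```

Under this assumption, the structured representation stores d(p + 1) numbers instead of d², and a matrix-vector product costs O(dp) operations, which is the kind of scaling the abstract describes.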