Natural gradients can significantly improve convergence in stochastic variational inference, but inverting the Fisher information matrix is daunting in high dimensions. Moreover, in Gaussian variational approximation, natural gradient updates of the precision matrix do not ensure positive definiteness. To tackle this issue, we derive analytic natural gradient updates of the Cholesky factor of the covariance or precision matrix, and consider sparsity constraints representing different posterior correlation structures. Stochastic normalized natural gradient ascent with momentum is proposed for implementation in generalized linear mixed models and deep neural networks.
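The positive-definiteness issue motivating the Cholesky parameterization can be illustrated with a minimal numerical sketch. The matrices, step sizes, and the diagonal safeguard below are all hypothetical choices for illustration, not the paper's actual natural gradient updates: an additive step applied directly to a precision matrix can leave the positive-definite cone, whereas a step applied to its Cholesky factor cannot, as long as the factor's diagonal stays nonzero.

```python
import numpy as np

d = 3
# A fixed positive-definite precision matrix (diagonally dominant)
# and its Cholesky factor.
prec = np.array([[4.0, 1.0, 0.0],
                 [1.0, 3.0, 0.5],
                 [0.0, 0.5, 2.0]])
C = np.linalg.cholesky(prec)            # prec = C @ C.T

# A deliberately large gradient-like step (illustrative only).
eta = np.linalg.eigvalsh(prec).min() + 1.0
step = np.eye(d)

# Updating the precision matrix directly leaves the PD cone:
# its smallest eigenvalue becomes negative.
prec_direct = prec - eta * step
print(np.linalg.eigvalsh(prec_direct).min() > 0)   # False

# Updating the (lower-triangular) Cholesky factor instead keeps
# C @ C.T positive definite whenever the diagonal of C is nonzero.
C_new = C - 0.1 * np.tril(step)
diag = np.diag_indices(d)
C_new[diag] = np.abs(C_new[diag])        # keep diagonal strictly positive
prec_chol = C_new @ C_new.T
print(np.linalg.eigvalsh(prec_chol).min() > 0)     # True
```

The design point is that any lower-triangular `C` with a strictly positive diagonal yields a valid precision matrix `C @ C.T`, so the update can be unconstrained in the factor's entries; this is why the paper derives natural gradient updates for the Cholesky factor rather than for the precision matrix itself.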