R-VGAL: A Sequential Variational Bayes Algorithm for Generalised Linear Mixed Models

Models with random effects, such as generalised linear mixed models (GLMMs), are often used for analysing clustered data. Parameter inference with these models is difficult because the cluster-specific random effects must be integrated out when evaluating the likelihood function. Here, we propose a sequential variational Bayes algorithm, called Recursive Variational Gaussian Approximation for Latent variable models (R-VGAL), for estimating parameters in GLMMs. The R-VGAL algorithm operates on the data sequentially, requires only a single pass through the data, and can update the parameter estimates as new data are collected without reprocessing previous data. At each update, the R-VGAL algorithm requires the gradient and Hessian of a "partial" log-likelihood function evaluated at the new observation, which are generally not available in closed form for GLMMs. To circumvent this issue, we propose an importance-sampling-based approach for estimating the gradient and Hessian via Fisher's and Louis' identities. We find that R-VGAL can be unstable when traversing the first few data points, but that this issue can be mitigated by using a variant of variational tempering in the initial steps of the algorithm. Through illustrations on both simulated and real datasets, we show that R-VGAL provides good approximations to the exact posterior distributions, that it can be made robust through tempering, and that it is computationally efficient.
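To make the gradient step concrete, the following is a minimal sketch of Fisher's identity estimated by self-normalised importance sampling. It is not taken from the paper: the random-intercept logistic model, the choice of the random-effect prior N(0, tau^2) as the proposal, and the names `log_lik_given_alpha` and `fisher_gradient_is` are all assumptions made for illustration. Fisher's identity states that the gradient of the marginal log-likelihood log p(y | theta) equals the expectation of the gradient of the complete-data log-likelihood log p(y, alpha | theta) under p(alpha | y, theta); here only the gradient with respect to the fixed effects `beta` is shown.

```python
import numpy as np


def log_lik_given_alpha(y, x, beta, alpha):
    """Bernoulli log-likelihood of one cluster for each random intercept.

    y: (n,) binary responses, x: (n, p) covariates, beta: (p,) fixed effects,
    alpha: (S,) candidate random intercepts. Returns an array of shape (S,).
    """
    logits = (x @ beta)[None, :] + alpha[:, None]  # (S, n)
    # log p(y_j | alpha) = y_j * logit - log(1 + exp(logit))
    return np.sum(y[None, :] * logits - np.logaddexp(0.0, logits), axis=1)


def fisher_gradient_is(y, x, beta, tau, n_samples=5000, rng=None):
    """Estimate d/d(beta) log p(y | beta, tau) for one cluster.

    Assumed setup (illustrative only): alpha ~ N(0, tau^2) is used as the
    importance proposal, so the unnormalised weight of each sample is
    p(y | alpha, beta).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    alpha = rng.normal(0.0, tau, size=n_samples)  # draws from the proposal

    # Self-normalised importance weights, computed stably in log space.
    log_w = log_lik_given_alpha(y, x, beta, alpha)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()

    # Complete-data gradient w.r.t. beta for each sample:
    # sum_j (y_j - sigmoid(x_j' beta + alpha)) * x_j
    logits = (x @ beta)[None, :] + alpha[:, None]
    resid = y[None, :] - 1.0 / (1.0 + np.exp(-logits))  # (S, n)
    grads = resid @ x  # (S, p)

    # Fisher's identity: weighted average of complete-data gradients.
    return w @ grads


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    x = data_rng.normal(size=(6, 2))
    y = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 1.0])
    print(fisher_gradient_is(y, x, beta=np.array([0.3, -0.5]), tau=1.0))
```

The Hessian can be estimated from the same weighted samples via Louis' identity, which combines the weighted average of the complete-data Hessians with the weighted covariance of the complete-data gradients. Using the prior as the proposal keeps the sketch short; a proposal closer to p(alpha | y, theta) would generally reduce the variance of the estimate.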

