Hierarchical Bayesian Poisson regression models (HBPRMs) provide a flexible approach to modeling the relationship between predictors and count response variables. Applying HBPRMs to large-scale datasets requires efficient inference algorithms because inferring many model parameters by random sampling is computationally expensive. Although Markov chain Monte Carlo (MCMC) algorithms are widely used for Bayesian inference, sampling with this class of algorithms is time-consuming in applications with large-scale data and time-sensitive decision-making, partly because many models are non-conjugate. To overcome this limitation, this research develops an approximate Gibbs sampler (AGS) that learns HBPRMs efficiently while maintaining inference accuracy. In the proposed sampler, the data likelihood is approximated with a Gaussian distribution so that the conditional posterior of the coefficients has a closed-form solution. Numerical experiments using real and synthetic datasets with small and large counts demonstrate the superior performance of AGS in comparison to the state-of-the-art sampling algorithm, especially for large datasets.
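To illustrate why a Gaussian approximation of the likelihood yields a closed-form conditional posterior, the following is a minimal sketch under stated assumptions, not the sampler developed in this work. It assumes a single coefficient per group, a log link, a Gaussian prior on that coefficient, and one possible approximation in which each Poisson term is replaced by a Gaussian pseudo-observation on the log-rate scale, with z = log(y + offset) and precision y + offset. The function names, the `offset` parameter, and this particular approximation are hypothetical choices made for the example.

```python
import math
import random


def gaussian_conditional_posterior(x, y, prior_mean, prior_var, offset=0.5):
    """Closed-form Gaussian conditional for one group's coefficient beta.

    Illustrative model: y_i ~ Poisson(exp(beta * x_i)),
    beta ~ N(prior_mean, prior_var).
    Assumed approximation: each Poisson term is replaced by
    N(z_i | beta * x_i, 1 / w_i), with z_i = log(y_i + offset) and
    w_i = y_i + offset. The offset is an ad hoc device to keep zero
    counts finite; it is not taken from the source.
    """
    precision = 1.0 / prior_var            # prior precision
    weighted_sum = prior_mean / prior_var  # prior contribution to the mean
    for xi, yi in zip(x, y):
        w = yi + offset            # approximate precision of the pseudo-observation
        z = math.log(yi + offset)  # pseudo-observation on the log-rate scale
        precision += w * xi * xi
        weighted_sum += w * xi * z
    post_var = 1.0 / precision
    post_mean = post_var * weighted_sum
    return post_mean, post_var


def draw_beta(x, y, prior_mean, prior_var, rng):
    """One Gibbs-style draw of beta from the approximate conditional."""
    mean, var = gaussian_conditional_posterior(x, y, prior_mean, prior_var)
    return rng.gauss(mean, math.sqrt(var))
```

Because the prior and the approximated likelihood are both Gaussian in the coefficient, the update reduces to adding precisions and precision-weighted means, so each draw needs no accept/reject step. In a full hierarchical sampler, a draw like `draw_beta(x, y, mu, tau2, random.Random(0))` would alternate with updates of the group-level hyperparameters, which are not shown here.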