Count data with complex features arise in many disciplines, including ecology, agriculture, criminology, medicine, and public health. Zero inflation, spatial dependence, and non-equidispersion are common features of such data. Two classes of models accommodate these features: the mode-parameterized Conway--Maxwell--Poisson (COMP) distribution and the generalized Poisson model. However, both require either constraints on the parameter space or a parameterization that is difficult to interpret. We propose a spatial mean-parameterized COMP model that retains the flexibility of these models while resolving both issues. We use a Bayesian spatial filtering approach to handle high-dimensional spatial data efficiently, and we use reversible-jump MCMC to choose the basis vectors for spatial filtering automatically. The COMP distribution poses two additional computational challenges: an intractable normalizing function in the likelihood and no closed-form expression for the mean. We propose a fast computational approach that addresses these challenges by, respectively, introducing an efficient auxiliary variable algorithm and pre-computing key approximations for fast likelihood evaluation. We illustrate our methodology on simulated and real datasets, including Texas HPV-cancer data and US vaccine refusal data.
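To make the two COMP difficulties concrete, the following is a minimal Python sketch, not the method proposed in the paper. It assumes the standard COMP form with probability mass proportional to lambda^y / (y!)^nu, approximates the normalizing function and the mean by brute-force truncation of the infinite series, and recovers the rate lambda for a target mean mu by bisection. The function names, the truncation length `max_terms`, and the search bounds are illustrative assumptions; truncation is only reasonable when the tail beyond `max_terms` is negligible.

```python
import math


def comp_log_terms(lam, nu, max_terms=1000):
    """Log of the unnormalized COMP weights lam**j / (j!)**nu for j = 0..max_terms-1."""
    log_lam = math.log(lam)
    return [j * log_lam - nu * math.lgamma(j + 1) for j in range(max_terms)]


def comp_log_z(lam, nu, max_terms=1000):
    """Truncated-series approximation of the log normalizing function (log-sum-exp)."""
    terms = comp_log_terms(lam, nu, max_terms)
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def comp_pmf(y, lam, nu, max_terms=1000):
    """Approximate P(Y = y) under the truncated normalizing function."""
    log_p = y * math.log(lam) - nu * math.lgamma(y + 1) - comp_log_z(lam, nu, max_terms)
    return math.exp(log_p)


def comp_mean(lam, nu, max_terms=1000):
    """Approximate E[Y]; there is no closed form, so sum j * P(Y = j) numerically."""
    terms = comp_log_terms(lam, nu, max_terms)
    m = max(terms)
    weights = [math.exp(t - m) for t in terms]
    return sum(j * w for j, w in enumerate(weights)) / sum(weights)


def lambda_for_mean(mu, nu, max_terms=1000, tol=1e-10):
    """Find lam such that the (truncated) COMP mean equals mu.

    Bisection on log(lam); the mean is increasing in lam.
    The bounds [-50, 50] on log(lam) are an arbitrary assumption.
    """
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if comp_mean(math.exp(mid), nu, max_terms) < mu:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))


if __name__ == "__main__":
    # nu = 1 reduces COMP to the Poisson distribution, a convenient sanity check.
    print(comp_mean(3.0, 1.0))
    # Under-dispersed case (nu > 1): rate that yields a mean of 5.
    print(lambda_for_mean(5.0, 2.0))
```

Each likelihood evaluation in this sketch costs a full series summation plus a root-finding loop, which illustrates why a mean-parameterized COMP model benefits from pre-computed approximations and an algorithm that avoids evaluating the normalizing function directly.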