We develop a new regression model for longitudinal count data that accounts for zero-inflation and for spatio-temporal correlation across responses. The work is motivated by an analysis of data from the Iowa Fluoride Study (IFS), a longitudinal cohort study in which caries (cavity) experience scores were measured for each tooth at five time points. To handle zero-inflation, we use a hurdle model with two parts: a presence model, which uses logistic regression to indicate whether a count is non-zero, and a severity model, which describes the non-zero counts through a shifted Negative Binomial distribution that allows for overdispersion. To incorporate dependence across measurement occasions and teeth, these marginal models are embedded within a Gaussian copula that introduces spatio-temporal correlations. A distinct advantage of this formulation is that covariate effects have population-level (marginal) interpretations, in contrast to those from mixed models. Standard Bayesian sampling from such a model is infeasible, so we use approximate Bayesian computation for inference. We apply the approach to the IFS data to gain insight into the risk factors for dental caries and the correlation structure across teeth and time.
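As a rough illustration of the model structure described above, the following Python sketch simulates correlated counts by pushing Gaussian copula draws through a hurdle marginal (logistic presence part, shifted Negative Binomial severity part). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `hurdle_nb_ppf` and `simulate`, the separable correlation structure (exchangeable across teeth, AR(1) across time), the single covariate, and all coefficient values are hypothetical choices made only for illustration.

```python
import numpy as np
from scipy import stats
from scipy.special import expit


def hurdle_nb_ppf(u, p_nonzero, mu, r):
    """Quantile function of the hurdle marginal.

    P(Y = 0) = 1 - p_nonzero; given Y > 0, Y - 1 follows a Negative
    Binomial with mean mu and size r (so the severity part is shifted by 1).
    """
    p_zero = 1.0 - p_nonzero
    # Rescale the part of u above the zero mass to [0, 1) for the severity model.
    v = np.clip((u - p_zero) / p_nonzero, 0.0, 1.0 - 1e-12)
    severity = 1 + stats.nbinom.ppf(v, r, r / (r + mu))
    return np.where(u <= p_zero, 0, severity).astype(int)


def simulate(n_subjects, n_teeth=3, n_times=2,
             rho_tooth=0.4, rho_time=0.6, seed=0):
    """Draw counts for each subject over teeth x time points."""
    rng = np.random.default_rng(seed)

    # Assumed correlation: exchangeable across teeth, AR(1) across time,
    # combined through a Kronecker product (illustrative choice only).
    R_tooth = np.full((n_teeth, n_teeth), rho_tooth)
    np.fill_diagonal(R_tooth, 1.0)
    lags = np.abs(np.subtract.outer(np.arange(n_times), np.arange(n_times)))
    R_time = rho_time ** lags
    R = np.kron(R_time, R_tooth)

    # Gaussian copula: correlated normals mapped to uniforms.
    d = n_teeth * n_times
    z = rng.multivariate_normal(np.zeros(d), R, size=n_subjects)
    u = stats.norm.cdf(z)

    # Hypothetical subject-level covariate and regression coefficients.
    x = rng.normal(size=(n_subjects, 1))
    p_nonzero = expit(-1.0 + 0.5 * x)   # presence model (logistic)
    mu = np.exp(0.2 + 0.3 * x)          # severity model mean (log link)

    y = hurdle_nb_ppf(u, p_nonzero, mu, r=2.0)
    return y, x


if __name__ == "__main__":
    y, x = simulate(n_subjects=2000)
    print("shape:", y.shape)
    print("fraction of zeros:", (y == 0).mean())
```

Because the covariates enter only through the marginal presence and severity models, the regression coefficients keep their population-level meaning regardless of the copula correlation, which is the interpretive advantage the abstract points to.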