This study introduces an outlier-robust model for analyzing hierarchically structured bounded count data within a Bayesian framework, using a logistic regression approach implemented in JAGS. Our model, the binomial-logit-t model, incorporates a t-distributed latent variable to address overdispersion and outliers, improving robustness compared to conventional models such as the beta-binomial, binomial-logit-normal, and standard binomial models. Notably, the model targets a pseudo-median that differs from the true discrete median by less than one count; this closed-form quantity provides a robust and interpretable measure of central tendency. For comparability across all models, we additionally make predictions based on the mean proportion, although this requires integrating over the t-distributed nuisance parameter. Little existing literature specifically addresses outliers in mixed models for bounded count data, and this research addresses that gap. The practical utility of the model is demonstrated using a longitudinal medication adherence dataset, in which patient behavior often produces abrupt changes and outliers within individual trajectories. A simulation study demonstrates the binomial-logit-t model's strong performance, with comparison statistics favoring it among the four evaluated models. An additional data contamination simulation confirms its robustness against outliers. Because the approach accommodates outliers rather than removing them, it preserves the full dataset while providing more accurate and reliable parameter estimates.
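The distinction between the closed-form pseudo-median and the mean proportion, which needs an integration step, can be illustrated with a minimal sketch. It is not the authors' JAGS implementation. It assumes a simplified, non-hierarchical version of the model in which the count is binomial with success probability expit(eta + scale * T), where T is a standard Student-t latent variable; under that assumption the pseudo-median is taken to be n * expit(eta), since T has median zero. The names `eta`, `scale`, `df`, and all function names are hypothetical and chosen for illustration only.

```python
import math
import random


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def t_pdf(x, df):
    """Density of a standard Student-t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def pseudo_median(n_trials, eta):
    """Closed form (assumed): plug the latent variable's median, 0, into the logit."""
    return n_trials * expit(eta)


def mean_proportion(eta, scale, df, steps=20000):
    """E[expit(eta + scale * T)] for T ~ t(df), by numerical integration.

    Uses the substitution x = tan(theta) and a midpoint rule on
    (-pi/2, pi/2), so the heavy tails are covered without truncation.
    Assumes df > 1 so the transformed integrand vanishes at the endpoints.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h
        x = math.tan(theta)
        jacobian = 1.0 / math.cos(theta) ** 2
        total += expit(eta + scale * x) * t_pdf(x, df) * jacobian
    return total * h


def simulate_counts(n_obs, n_trials, eta, scale, df, rng):
    """Draw counts from the assumed binomial-logit-t data-generating process."""
    counts = []
    for _ in range(n_obs):
        # Student-t draw: standard normal over sqrt(chi-square / df).
        chi2 = rng.gammavariate(df / 2, 2.0)
        t_draw = rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)
        p = expit(eta + scale * t_draw)
        counts.append(sum(rng.random() < p for _ in range(n_trials)))
    return counts


if __name__ == "__main__":
    rng = random.Random(1)
    n_trials, eta, scale, df = 30, 1.0, 1.0, 3
    counts = sorted(simulate_counts(5000, n_trials, eta, scale, df, rng))
    print("pseudo-median (closed form):", pseudo_median(n_trials, eta))
    print("sample median of simulated counts:", counts[len(counts) // 2])
    print("mean count via integration:", n_trials * mean_proportion(eta, scale, df))
```

In this sketch the pseudo-median needs only a single evaluation of the logistic function, whereas the mean proportion requires a numerical integral over the latent variable, which mirrors the trade-off described in the abstract. For positive `eta`, the integrated mean proportion lies below expit(eta) because the symmetric latent noise pulls the average toward 0.5.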