Beta regression is used routinely for continuous proportional data, but it often runs into practical issues such as a lack of robustness to misspecification of the beta distribution and sensitivity to outliers. We develop an improved class of generalized linear models, starting with the continuous binomial (cobin) distribution and extending to dispersion mixtures of cobin distributions (micobin). The proposed cobin and micobin regression models have attractive robustness, computational, and flexibility properties. A key innovation is the Kolmogorov-Gamma data augmentation scheme, which facilitates Gibbs sampling for Bayesian computation, including in hierarchical settings involving nested, longitudinal, or spatial data. In simulation experiments and in an analysis of the benthic macroinvertebrate multimetric index of US lakes using lake watershed covariates, we demonstrate robustness, the ability to handle responses exactly at the boundary (0 or 1), and computational efficiency relative to beta regression.