Count data play a critical role in medical research, for example in studies of heart disease. Poisson regression is a common technique for evaluating the effect of a set of covariates on a count response. The mixture of Poisson regression models with experts is a practical tool that uses the covariates not only to handle heterogeneity in the Poisson regressions but also to learn the mixing structure of the population. Multicollinearity is one of the most common challenges in regression modeling; it leads to ill-conditioned design matrices for both the Poisson regression components and the expert classes. Under multicollinearity, the maximum likelihood (ML) method produces unreliable and misleading estimates of the covariate effects. In this research, we develop Ridge and Liu-type methods as two shrinkage approaches to cope with the ill-conditioned design matrices of the mixture of Poisson regression models with experts. Through various numerical studies, we demonstrate that, under multicollinearity, the shrinkage methods offer more reliable estimates of the mixture model coefficients while maintaining the classification performance of the ML method. Finally, the shrinkage methods are applied to a heart study to analyze the heart disease rate stages.
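As a rough illustration of the ridge idea described in the abstract, the sketch below fits a single Poisson regression (log link) with an L2 penalty by Newton steps and compares it with the unpenalized ML fit on nearly collinear covariates. It is a minimal sketch under stated assumptions, not the estimation procedure developed in this work: it covers one Poisson component only (no mixture, no expert classes, no Liu-type estimator), it penalizes every coefficient including the intercept, and the function name `ridge_poisson`, the penalty value `lam`, and the simulated data are all hypothetical choices made for the example.

```python
import numpy as np

def ridge_poisson(X, y, lam=1.0, n_iter=50, tol=1e-8):
    """Ridge-penalized Poisson regression (log link) fitted by Newton steps.

    Maximizes sum_i [y_i * eta_i - exp(eta_i)] - (lam / 2) * ||beta||^2,
    where eta = X @ beta. Setting lam = 0 gives the ordinary ML fit.
    Assumption: all coefficients, including any intercept column, are penalized.
    """
    p = X.shape[1]
    beta = np.zeros(p)

    def pen_loglik(b):
        # Clip the linear predictor to avoid overflow in exp().
        eta = np.clip(X @ b, -30.0, 30.0)
        return y @ eta - np.exp(eta).sum() - 0.5 * lam * (b @ b)

    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ beta, -30.0, 30.0))
        grad = X.T @ (y - mu) - lam * beta                 # penalized score
        hess = X.T @ (X * mu[:, None]) + lam * np.eye(p)   # penalized information
        step = np.linalg.solve(hess, grad)

        # Step halving: shrink the step until the objective does not decrease.
        t = 1.0
        current = pen_loglik(beta)
        while pen_loglik(beta + t * step) < current and t > 1e-8:
            t /= 2.0

        beta = beta + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return beta

# Simulated example with two nearly collinear covariates (hypothetical data).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # almost a copy of x1
X = np.column_stack([np.ones(n), x1, x2])
beta_true = np.array([0.5, 0.3, 0.3])
y = rng.poisson(np.exp(X @ beta_true)).astype(float)

lam = 5.0
b_ml = ridge_poisson(X, y, lam=0.0)          # unpenalized ML fit
b_ridge = ridge_poisson(X, y, lam=lam)       # ridge fit

print("ML estimate:   ", b_ml)
print("Ridge estimate:", b_ridge)
print("Norms (ML, ridge):", np.linalg.norm(b_ml), np.linalg.norm(b_ridge))
```

Because the penalty adds `lam` to the diagonal of the information matrix, the linear system solved at each step stays well conditioned even when `x1` and `x2` are nearly identical. In practice the penalty parameter would be tuned rather than fixed, and a mixture-of-experts fit would embed weighted versions of such updates inside an iterative scheme such as EM.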