Count regression models are essential for relating discrete dependent variables to covariates. However, when data exhibit outliers, overdispersion, and an excess of zeros, traditional methods such as the zero-inflated negative binomial (ZINB) model sometimes fail to provide a satisfactory fit, especially in the tail regions. This research presents a versatile, heavy-tailed discrete model as a robust alternative to the ZINB model. The proposed framework is built by extending the generalized Pareto distribution and its zero-inflated version to the discrete domain. This formulation addresses both overdispersion and zero inflation, providing increased flexibility for heavy-tailed count data. The proposed models are evaluated through extensive simulation studies and real-data applications. The results show that they consistently outperform the classical negative binomial and zero-inflated negative binomial regressions in terms of goodness-of-fit, especially for datasets with many zeros and outliers. These results highlight the proposed framework's potential as a robust and flexible option for modeling complex count data.
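To make the construction concrete, the following is a minimal sketch of one common way to carry a continuous distribution over to the discrete domain: differencing the survival function at consecutive integers, then mixing in a point mass at zero. The abstract does not state which discretization the authors use, so this scheme, the restriction to a positive shape parameter, and the names `gpd_survival`, `dgpd_pmf`, `zidgpd_pmf`, `sigma`, `xi`, and `pi0` are all assumptions made for illustration. How covariates enter the model is not shown.

```python
import numpy as np


def gpd_survival(x, sigma, xi):
    """Survival function of the continuous generalized Pareto distribution.

    Assumes scale sigma > 0, shape xi > 0 (the heavy-tailed case), and x >= 0.
    """
    x = np.asarray(x, dtype=float)
    return (1.0 + xi * x / sigma) ** (-1.0 / xi)


def dgpd_pmf(k, sigma, xi):
    """Discrete GPD pmf on k = 0, 1, 2, ... via survival differencing.

    Assumed construction: P(Y = k) = S(k) - S(k + 1).
    """
    k = np.asarray(k, dtype=float)
    return gpd_survival(k, sigma, xi) - gpd_survival(k + 1.0, sigma, xi)


def zidgpd_pmf(k, sigma, xi, pi0):
    """Zero-inflated version: extra point mass pi0 at zero, 0 <= pi0 < 1."""
    k = np.asarray(k)
    base = dgpd_pmf(k, sigma, xi)
    return np.where(k == 0, pi0 + (1.0 - pi0) * base, (1.0 - pi0) * base)


if __name__ == "__main__":
    counts = np.arange(6)
    print(dgpd_pmf(counts, sigma=2.0, xi=0.5))
    print(zidgpd_pmf(counts, sigma=2.0, xi=0.5, pi0=0.3))
```

Under this construction the probabilities telescope, so the mass on 0 through N - 1 equals 1 - S(N). That identity gives a quick check that the pmf is properly normalized, and it shows that the tail of the discrete model decays at the same polynomial rate as the continuous one.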