Count data with excess zeros arise frequently in fields such as economics, medicine, and public health. Traditional count models often handle such data poorly, especially when the relationship between the response and some predictors is nonlinear. To overcome these limitations, the partially linear zero-inflated Poisson (PLZIP) model has been proposed as a flexible alternative. However, all existing estimation approaches for this model are likelihood-based, and likelihood-based estimation is known to be highly sensitive to outliers and to slight deviations from the model assumptions. This article presents the first robust estimation method specifically developed for the PLZIP model. An Expectation-Maximization-like algorithm is used to exploit the mixture structure of the model and to address extreme observations in both the response and the covariates. Convergence of the algorithm and consistency of the estimators are proved. A simulation study under various contamination schemes shows the robustness and efficiency of the proposed estimators in finite samples relative to classical estimators. Finally, the methodology is illustrated with an application to real data.