Tweedie generalized linear models are commonly applied in the insurance industry to analyze semicontinuous claim data. To better predict the aggregate claim size, the mean and dispersion of the Tweedie model are often estimated jointly using double generalized linear models. In some actuarial applications, the data contain an excessive proportion of zeros, which often degrades the performance of the Tweedie model. The zero-inflated Tweedie model, inspired by the zero-inflated Poisson model, has recently been considered in the literature. In this article, we model the dispersion of the Tweedie state in the zero-inflated Tweedie model in addition to its mean. We also model the probability of the zero state, with estimation based on the generalized expectation-maximization algorithm. To accommodate possible nonlinear and interaction effects of the covariates, we estimate the mean, dispersion, and zero-state probability using decision-tree-based gradient boosting. We conduct extensive numerical studies to demonstrate the improved performance of our method over existing ones.
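To make the two-state structure concrete, the following is a minimal sketch of the quantity an expectation-maximization scheme would compute in its E-step for such a mixture: the posterior probability that an observed zero came from the zero state rather than from the Tweedie state. It is not the authors' implementation. The notation is assumed for illustration (`q` for the zero-state probability, `mu` and `phi` for the Tweedie mean and dispersion, `p` for the power parameter, restricted to 1 < p < 2 so that the Tweedie state is a compound Poisson-gamma distribution with a point mass at zero), and the function names are hypothetical. The M-step, where the abstract fits boosted trees for the three components, is not shown.

```python
import math


def tweedie_zero_prob(mu, phi, p):
    """P(Y = 0) for a Tweedie(mu, phi, p) variable with 1 < p < 2.

    In this range the Tweedie distribution is compound Poisson-gamma,
    and a zero occurs exactly when the Poisson count is zero. The
    Poisson rate is mu**(2 - p) / (phi * (2 - p)).
    """
    if not 1.0 < p < 2.0:
        raise ValueError("power parameter p must lie in (1, 2)")
    rate = mu ** (2.0 - p) / (phi * (2.0 - p))
    return math.exp(-rate)


def zero_state_posterior(y, q, mu, phi, p):
    """Posterior probability that observation y came from the zero state.

    Assumed mixture: with probability q the response is a structural
    zero, and with probability 1 - q it is drawn from Tweedie(mu, phi, p).
    A positive y can only come from the Tweedie state.
    """
    if y > 0:
        return 0.0
    p0 = tweedie_zero_prob(mu, phi, p)
    # Bayes' rule over the two ways of producing a zero.
    return q / (q + (1.0 - q) * p0)


def zero_inflated_mean(q, mu):
    """Mean of the assumed mixture: the zero state contributes nothing."""
    return (1.0 - q) * mu
```

For example, with `q = 0.5`, `mu = 1.0`, `phi = 1.0`, and `p = 1.5`, the Tweedie state produces a zero with probability exp(-2), so `zero_state_posterior(0, 0.5, 1.0, 1.0, 1.5)` evaluates to 1 / (1 + exp(-2)), roughly 0.88. A larger dispersion `phi` raises the Tweedie state's own zero probability and therefore lowers this posterior weight, which is one reason the dispersion and the zero-state probability interact during estimation.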