The statistical modeling of discrete extremes has received less attention than that of their continuous counterparts in the Extreme Value Theory (EVT) literature. One approach to the transition from continuous to discrete extremes is to model threshold exceedances of integer random variables with the discrete version of the generalized Pareto distribution. However, the optimal choice of the threshold defining exceedances remains difficult. Moreover, in a regression framework, the majority of the data, which are non-extreme and lie below the selected threshold, are either ignored or modeled separately from the extremes. To tackle these issues, we build on the idea of a smooth transition between the bulk and the upper tail of the distribution. For zero-inflated data, we also develop models with an additional parameter. To incorporate possible predictors, we relate the parameters to additive smoothed predictors via an appropriate link function, as in the generalized additive model (GAM) framework. A penalized maximum likelihood estimation procedure is implemented. We illustrate our modeling proposal with a real dataset of avalanche activity in the French Alps. With the advantage of bypassing the threshold selection step, our results indicate that the proposed models are more flexible and robust than competing models, such as the negative binomial distribution.
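For readers unfamiliar with the discrete generalized Pareto distribution mentioned above, the following is a minimal sketch of one common construction: the probability mass at an integer k is taken as the difference of the continuous GPD survival function at k and k + 1. The abstract does not state the exact parameterization used, so this construction, the function names `gpd_survival` and `dgpd_pmf`, and the parameter names `sigma` (scale) and `xi` (shape) are assumptions for illustration only, not the authors' implementation.

```python
import math


def gpd_survival(x, sigma, xi):
    """Survival function of the continuous GPD at x >= 0 (assumed parameterization)."""
    if xi == 0.0:
        # Limiting case xi -> 0: exponential tail.
        return math.exp(-x / sigma)
    base = 1.0 + xi * x / sigma
    if base <= 0.0:
        # Beyond the finite upper endpoint, which exists only when xi < 0.
        return 0.0
    return base ** (-1.0 / xi)


def dgpd_pmf(k, sigma, xi):
    """P(Y = k) for integer k >= 0, obtained by discretizing the GPD survival function."""
    if k < 0:
        return 0.0
    return gpd_survival(k, sigma, xi) - gpd_survival(k + 1, sigma, xi)


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the paper.
    for k in range(5):
        print(k, dgpd_pmf(k, sigma=2.0, xi=0.2))
```

Because the masses telescope, summing `dgpd_pmf` over k = 0, ..., n - 1 gives 1 minus the survival function at n, which is a convenient sanity check for any implementation of this kind.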