Threshold selection is a fundamental problem in any threshold-based extreme value analysis. Although extreme value models are asymptotically motivated, selecting an appropriate threshold for a finite sample is difficult, and standard methods make the choice highly subjective. Inference for high quantiles can also be highly sensitive to the choice of threshold. Too low a threshold leads to bias in the fit of the extreme value model, while too high a threshold leads to unnecessary additional uncertainty in the estimation of model parameters. We develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method to account for the uncertainty in the threshold estimation and to propagate this uncertainty through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation, relative to the leading existing methods, and show that its effectiveness is not sensitive to the tuning parameters. We apply our method to the well-known, troublesome example of the River Nidd dataset.
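To make the threshold sensitivity described above concrete, the following is a minimal sketch, not the selection method proposed in this work. It assumes simulated Student-t data (a hypothetical stand-in for real observations, with true tail shape 1/4) and uses SciPy's `genpareto` to fit a generalised Pareto distribution to the excesses above several candidate thresholds. The function name `gpd_fits_by_threshold` and the candidate quantile levels are illustrative choices only.

```python
import numpy as np
from scipy.stats import genpareto


def gpd_fits_by_threshold(sample, probs):
    """Fit a GPD to the excesses above each empirical-quantile threshold.

    Illustrative helper (an assumption of this sketch, not the paper's method).
    """
    results = []
    for p in probs:
        u = np.quantile(sample, p)            # candidate threshold
        excesses = sample[sample > u] - u     # excesses above the threshold
        # floc=0 fixes the location at zero, so only shape and scale are estimated.
        shape, _, scale = genpareto.fit(excesses, floc=0)
        results.append(
            {"prob": p, "threshold": u, "n_exc": excesses.size,
             "shape": shape, "scale": scale}
        )
    return results


# Simulated heavy-tailed data; a Student-t with 4 degrees of freedom
# has tail shape parameter 1/4.
rng = np.random.default_rng(0)
sample = rng.standard_t(df=4, size=5000)

for row in gpd_fits_by_threshold(sample, [0.5, 0.8, 0.9, 0.95, 0.99]):
    print(
        f"p={row['prob']:.2f}  u={row['threshold']:.3f}  "
        f"n_exc={row['n_exc']:4d}  shape={row['shape']:.3f}  scale={row['scale']:.3f}"
    )
```

In a run of this kind, low thresholds include observations from the body of the distribution, where the generalised Pareto approximation is poor, while high thresholds leave few excesses and so give noisier parameter estimates. Comparing the fitted shape across rows gives a rough feel for the trade-off that an automated selection method has to resolve.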