Threshold selection is a fundamental problem in any threshold-based extreme value analysis. Although the models are asymptotically motivated, selecting an appropriate threshold for a finite sample is difficult, and standard methods make the choice highly subjective. Inference for high quantiles can also be highly sensitive to the choice of threshold: too low a threshold biases the fit of the extreme value model, while too high a threshold adds unnecessary uncertainty to the parameter estimates. We develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method to account for the uncertainty in the threshold estimation and to propagate it through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation relative to the leading existing methods, and show that this effectiveness is not sensitive to the tuning parameters. We apply our method to the well-known, troublesome River Nidd dataset.