Threshold selection is a fundamental problem in any threshold-based extreme value analysis. Although extreme value models are asymptotically motivated, selecting an appropriate threshold for a finite sample can be difficult with standard methods, and inference can be highly sensitive to that choice. Too low a threshold biases the fit of the extreme value model, while too high a threshold adds unnecessary uncertainty to the estimates of the model parameters. In this paper, we develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method that accounts for the uncertainty in the threshold choice and propagates it through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation. We apply our method to the well-known, troublesome River Nidd dataset.