Extreme quantiles are critical for understanding the behavior of data in the tail region of a distribution. Estimating them is challenging, particularly when few observations fall in the tail. In such cases, extreme value theory offers a solution by approximating the tail distribution with the Generalized Pareto Distribution (GPD), which allows extrapolation beyond the range of the observed data. However, in the conditional setting, where estimation depends on covariates, existing methods may require a computationally expensive GPD fit for each distinct observation. This burden becomes more problematic as the number of observations increases, and that number may grow without bound. To address this issue, we propose an interpolation-based algorithm named EMI, which enables online prediction of extreme conditional quantiles from a finite set of offline observations. Combining quantile regression with GPD-based extrapolation, EMI is formulated as a bilevel programming problem that can be solved efficiently with classic optimization methods. Once estimates for the offline observations are obtained, EMI applies B-spline interpolation to the covariate-dependent variables, so that estimates for online observations require only a finite number of GPD fits. Simulations and real data analysis demonstrate the effectiveness of EMI across various scenarios.
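To make the two ingredients named in the abstract concrete, the following is a minimal sketch of GPD-based tail extrapolation followed by B-spline interpolation of the fitted tail parameters over a one-dimensional covariate. It is not the EMI algorithm itself: the bilevel formulation is omitted, the threshold is taken as a plain empirical quantile rather than a quantile-regression fit, and the function names (`gpd_tail_fit`, `extreme_quantile`, `interpolate_tail_params`), the intermediate level `tau0 = 0.9`, and the simulated data are all assumptions made for illustration. The extrapolation uses the standard peaks-over-threshold formula $q_\tau = u + \frac{\sigma}{\xi}\left[\left(\frac{1-\tau}{1-\tau_0}\right)^{-\xi} - 1\right]$, where $u$ is the $\tau_0$-quantile and $(\xi, \sigma)$ are the GPD shape and scale.

```python
import numpy as np
from scipy.stats import genpareto
from scipy.interpolate import make_interp_spline


def gpd_tail_fit(y, tau0=0.9):
    """Fit a GPD to the exceedances of y over its empirical tau0-quantile.

    Returns (u, xi, sigma): threshold, GPD shape, GPD scale.
    """
    y = np.asarray(y, dtype=float)
    u = np.quantile(y, tau0)
    excess = y[y > u] - u
    # Location is fixed at 0 because exceedances start at the threshold.
    xi, _, sigma = genpareto.fit(excess, floc=0.0)
    return u, xi, sigma


def extreme_quantile(u, xi, sigma, tau, tau0=0.9):
    """Extrapolate the tau-quantile (tau > tau0) from a GPD tail fit."""
    ratio = (1.0 - tau) / (1.0 - tau0)
    if abs(xi) < 1e-8:
        # Limit xi -> 0: the GPD reduces to an exponential tail.
        return u - sigma * np.log(ratio)
    return u + sigma / xi * (ratio ** (-xi) - 1.0)


def interpolate_tail_params(x_grid, params, k=3):
    """Interpolating B-spline through per-covariate (u, xi, sigma) rows.

    x_grid must be strictly increasing with at least k + 1 points.
    """
    x_grid = np.asarray(x_grid, dtype=float)
    params = np.asarray(params, dtype=float)
    return make_interp_spline(x_grid, params, k=k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_grid = np.linspace(0.0, 1.0, 6)  # hypothetical offline covariate values
    fits = []
    for x in x_grid:
        # Simulated responses whose tail shape and scale vary with x.
        y = genpareto.rvs(0.1 + 0.2 * x, scale=1.0 + x, size=5000,
                          random_state=rng)
        fits.append(gpd_tail_fit(y, tau0=0.9))
    spline = interpolate_tail_params(x_grid, fits)
    # An online covariate value needs a spline evaluation, not a new GPD fit.
    u, xi, sigma = spline(0.37)
    print(extreme_quantile(u, xi, sigma, tau=0.999, tau0=0.9))
```

In this sketch the expensive step, `gpd_tail_fit`, runs once per offline covariate value, and each online prediction costs only one spline evaluation plus the closed-form extrapolation. Interpolated shape and scale values are not guaranteed to stay in a valid range between grid points, so a practical implementation would need to check or constrain them.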