Quantifying the likelihood of extreme events underpins risk assessment, yet classical Extreme Value Theory relies on asymptotic assumptions that fail in the data-sparse, non-stationary regimes practitioners increasingly face. We introduce the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that aggregates all observations metastatistically and reconstructs the base distribution with a kernel, removing parametric tail assumptions. We derive its optimal bandwidth and prove a stability law $m < C\,n^{1+\gamma/2}$ relating reliable extrapolation to the extreme value index $\gamma$. In sub-hourly Alpine precipitation, DDEVD recovers stable 100-year return levels from single decades (calibration ratio $0.96$): it departs from the full-record reference by over $50\,\%$ in fewer than one window in fifty, compared with one in five for a generalised extreme-value (GEV) fit. In metallurgical micrographs, it matches a GEV fit on the safety-relevant grain-size tail, where the standard log-normal over-predicts by $58\,\%$ at $1\,\mathrm{cm}^{2}$.
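The idea of rebuilding the base distribution with a kernel and then deriving extremes from it can be illustrated with a minimal sketch. This is not the DDEVD estimator itself: it assumes a Gaussian kernel, a fixed number of independent events per block, and Silverman's rule of thumb as a stand-in for the optimal bandwidth derived in the paper. All function names (`silverman_bandwidth`, `kernel_cdf`, `block_max_cdf`, `return_level`) and the synthetic exponential data are hypothetical and chosen for illustration only.

```python
import math
import random


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth (a placeholder, not the paper's optimal bandwidth)."""
    n = len(sample)
    mean = sum(sample) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def kernel_cdf(x, sample, bandwidth):
    """Gaussian-kernel estimate of the base CDF F(x), using every observation."""
    scale = bandwidth * math.sqrt(2.0)
    total = sum(0.5 * (1.0 + math.erf((x - xi) / scale)) for xi in sample)
    return total / len(sample)


def block_max_cdf(x, sample, events_per_block, bandwidth):
    """CDF of the maximum of `events_per_block` independent draws: F(x)^n."""
    return kernel_cdf(x, sample, bandwidth) ** events_per_block


def return_level(period, sample, events_per_block, bandwidth, tol=1e-6):
    """Solve block_max_cdf(x) = 1 - 1/period by bisection (period > 1, in blocks)."""
    target = 1.0 - 1.0 / period
    lo, hi = min(sample), max(sample)
    step = max(hi - lo, bandwidth)
    # Widen the bracket until it contains the target quantile.
    while block_max_cdf(lo, sample, events_per_block, bandwidth) > target:
        lo -= step
    while block_max_cdf(hi, sample, events_per_block, bandwidth) < target:
        hi += step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if block_max_cdf(mid, sample, events_per_block, bandwidth) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic example: 500 exponential "events", 50 events per block (assumed values).
random.seed(0)
events = [random.expovariate(1.0) for _ in range(500)]
h = silverman_bandwidth(events)
print(return_level(100, events, 50, h))
```

Because a Gaussian kernel has light tails, this sketch can extrapolate only a few bandwidths beyond the largest observation. How far extrapolation remains reliable is the question the stability law in the abstract addresses, and the sketch makes no attempt to reproduce that result.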