The generalized Pareto distribution (GPD) is a fundamental model for analyzing the tail behavior of a distribution. In particular, the shape parameter of the GPD characterizes the extremal properties of the distribution. In this paper, we propose a method for grouping the shape parameters of the GPD for clustered data via the graph fused lasso. The proposed method simultaneously estimates the model parameters and identifies which clusters can be grouped together. We establish the asymptotic theory of the proposed estimator and demonstrate that its variance is lower than that of the cluster-wise estimator. This variance reduction not only enhances estimation stability but also provides a principled basis for identifying homogeneity and heterogeneity among clusters in terms of their tail behavior. We assess the performance of the proposed estimator through Monte Carlo simulations. As an illustration, we apply the method to rainfall data from 996 clustered sites across Japan.
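To make the idea concrete, the following is a minimal sketch of the kind of objective a graph fused lasso for GPD shape parameters would minimize: the sum of cluster-wise GPD negative log-likelihoods plus an L1 penalty on differences of shape parameters along the edges of a graph. The abstract does not state the exact objective, so the function names (`gpd_nll`, `fused_objective`), the unweighted penalty, the treatment of the scale parameters as free per-cluster values, and the tuning parameter `lam` are all assumptions made for illustration, not the estimator defined in the paper.

```python
import numpy as np


def gpd_nll(y, xi, sigma):
    """Negative log-likelihood of GPD(xi, sigma) for exceedances y >= 0."""
    y = np.asarray(y, dtype=float)
    if sigma <= 0:
        return np.inf
    if abs(xi) < 1e-8:
        # Limit xi -> 0: exponential distribution with scale sigma.
        return y.size * np.log(sigma) + y.sum() / sigma
    z = 1.0 + xi * y / sigma
    if np.any(z <= 0):
        # Observation outside the support implied by (xi, sigma).
        return np.inf
    return y.size * np.log(sigma) + (1.0 / xi + 1.0) * np.log(z).sum()


def fused_objective(data, xi, sigma, edges, lam):
    """Cluster-wise GPD negative log-likelihood plus a fused penalty.

    data  : list of 1-D arrays, one array of exceedances per cluster
    xi    : shape parameter per cluster
    sigma : scale parameter per cluster
    edges : list of (j, k) index pairs defining the graph (assumed given)
    lam   : non-negative tuning parameter (hypothetical name)
    """
    nll = sum(gpd_nll(y, x, s) for y, x, s in zip(data, xi, sigma))
    # The L1 penalty on edge differences pulls connected shapes together;
    # clusters whose shapes coincide at the minimizer form one group.
    penalty = lam * sum(abs(xi[j] - xi[k]) for j, k in edges)
    return nll + penalty
```

With `lam = 0` the objective separates into independent cluster-wise fits, which corresponds to the cluster-wise estimator used as the baseline in the abstract. As `lam` grows, shape parameters on connected clusters are increasingly forced to share a common value.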