Motivated by the analysis of the behaviour of extremes from multivariate heavy-tailed distributions, we introduce a novel notion of statistical depth, referred to as polar depth. The polar depth function is naturally expressed in polar coordinates, as is the limiting distribution of a regularly varying random variable beyond asymptotically large thresholds, once its marginals have been appropriately normalized. The polar depth function makes it easy to order the extreme values taken by a heavy-tailed random variable X and finds natural applications in anomaly detection. Moreover, we prove under appropriate assumptions that the polar depth of the largest observations, i.e. observations X whose norm is larger than t > 0, converges to the polar depth of the limiting distribution as t tends to infinity. Although designed to quantify the depth of multivariate extremes, the polar depth is interesting in its own right: for distributions whose support is included in a halfspace, it is more relevant than the alternatives proposed in the literature, the halfspace depth in particular. We establish its properties and analyze the statistical issues related to its estimation from both finite-sample and asymptotic points of view. We present numerical results that empirically demonstrate its relevance, particularly for the statistical analysis of extreme observations and, more specifically, for the identification of anomalies among them.