Principal component analysis (PCA) is one of the most popular dimension reduction techniques in statistics and is especially powerful when a multivariate distribution is concentrated near a lower-dimensional subspace. Multivariate extreme value distributions pose challenges for PCA: their constrained support impedes the detection of lower-dimensional structures, and heavy tails can imply that second moments do not exist, which rules out classical variance-based techniques. We adapt PCA to max-stable distributions using a regression setting and employ max-linear maps to project the random vector to a lower-dimensional space while preserving max-stability. We also characterize the distributions that allow for a perfect reconstruction from the lower-dimensional representation. Finally, we demonstrate how an optimal projection matrix can be consistently estimated and show the method's practical viability with a simulation study and an application to a benchmark dataset.
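To make the notion of a max-linear map concrete, the following is a minimal sketch assuming the usual definition, in which a non-negative matrix A acts on a non-negative vector x by (A ∘ x)_i = max_j a_ij x_j. The function name `max_linear`, the matrices `W` and `B`, and the input vector are hypothetical and chosen only for illustration; they are not taken from the paper, and no estimation step is shown.

```python
import numpy as np


def max_linear(A, x):
    """Max-linear map: (A ∘ x)_i = max_j A[i, j] * x[j].

    Assumes A and x have non-negative entries (illustrative helper,
    not an API from the paper).
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    # Broadcast x across the rows of A, then take the row-wise maximum.
    return np.max(A * x[np.newaxis, :], axis=1)


# Hypothetical 2x3 "encoder" and 3x2 "decoder" matrices (assumptions).
W = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
B = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])

x = np.array([2.0, 1.0, 4.0])   # one non-negative observation in R^3
h = max_linear(W, x)            # 2-dimensional representation
x_hat = max_linear(B, h)        # reconstruction back in R^3
```

In this toy example the reconstruction `x_hat` differs from `x` in the second coordinate, which illustrates that a max-linear encoding followed by a max-linear decoding is in general lossy.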