We propose sparse principal loading analysis, a new method that reduces the dimensionality of cross-sectional data and identifies the underlying covariance structure. For dimensionality reduction, the method selects a subset of the existing variables and discards those that have only a small distorting effect on the covariance matrix. We show how to detect these variables and provide methods to assess the magnitude of their distortion. The method serves a second purpose as well: it uses sparse loadings to identify an underlying block-diagonal covariance structure. This approach is new in this context, and we provide the criterion needed to evaluate whether the detected block structure fits the sample. Because the method decomposes the covariance matrix with sparse loadings rather than eigenvectors, a large loss of information can result if the chosen loadings are too sparse. We show that this is not a concern here, because sparseness is controlled by the aforementioned evaluation criterion. Further, we demonstrate the advantages of sparse principal loading analysis for both variable selection and covariance structure detection, and illustrate its performance with simulations and on real datasets. Supplementary material for this article is available online.
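To make the block-detection idea concrete, the following is a minimal sketch and not the authors' procedure. The abstract does not specify how the sparse loadings are estimated or how the evaluation criterion is defined, so the sketch makes two assumptions: sparse loadings are approximated by hard-thresholding the eigenvectors of the covariance matrix, and fit is summarized by the share of squared covariance mass lying outside the detected blocks. The function names `sparse_loadings`, `detect_blocks`, and `off_block_share`, and the threshold `tau`, are hypothetical.

```python
import numpy as np


def sparse_loadings(cov, tau=0.1):
    """Hard-threshold the eigenvectors of `cov`.

    ASSUMPTION: thresholding is only a stand-in for a proper sparse
    loading estimator; `tau` is a hypothetical tuning parameter.
    """
    _, vecs = np.linalg.eigh(cov)  # columns are eigenvectors
    return np.where(np.abs(vecs) > tau, vecs, 0.0)


def detect_blocks(loadings):
    """Group variables that share a nonzero loading on some component."""
    support = (loadings != 0).astype(int)
    adj = (support @ support.T) > 0  # variable-by-variable adjacency
    unseen = set(range(adj.shape[0]))
    blocks = []
    while unseen:
        start = min(unseen)
        unseen.remove(start)
        stack, block = [start], []
        while stack:  # depth-first search over one connected component
            i = stack.pop()
            block.append(i)
            for j in np.flatnonzero(adj[i]):
                j = int(j)
                if j in unseen:
                    unseen.remove(j)
                    stack.append(j)
        blocks.append(sorted(block))
    return blocks


def off_block_share(cov, blocks):
    """Share of squared covariance mass outside the detected blocks.

    ASSUMPTION: an illustrative diagnostic, not the paper's criterion.
    """
    inside = np.zeros(cov.shape, dtype=bool)
    for b in blocks:
        inside[np.ix_(b, b)] = True
    return float((cov[~inside] ** 2).sum() / (cov ** 2).sum())


# Usage on a covariance matrix that is block diagonal by construction.
sigma = np.zeros((5, 5))
sigma[:2, :2] = [[1.0, 0.8], [0.8, 1.0]]
sigma[2:, 2:] = [[2.0, 0.5, 0.3], [0.5, 2.0, 0.4], [0.3, 0.4, 2.0]]

blocks = detect_blocks(sparse_loadings(sigma, tau=0.1))
print(blocks, off_block_share(sigma, blocks))
```

In this sketch, a larger `tau` produces sparser loadings and therefore smaller blocks, while the off-block share grows as more covariance is ignored. This mirrors the trade-off described above, in which overly sparse loadings lose information unless a fit criterion keeps sparseness in check.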