We introduce sparse principal loading analysis, a new concept that reduces the dimensionality of cross-sectional data and identifies the underlying covariance structure. For dimensionality reduction, the method selects a subset of the existing variables and discards variables that have only a small distorting effect on the covariance matrix. We show how to detect these variables and provide methods to assess the magnitude of their distortion. The second purpose of sparse principal loading analysis is to identify the underlying block-diagonal covariance structure using sparse loadings. This is a new approach in this context, and we provide the criterion required to evaluate whether the detected block structure fits the sample. Because the method decomposes the covariance matrix with sparse loadings rather than eigenvectors, it can lose a large amount of information if the chosen loadings are too sparse. However, we show that this is not a concern in our approach because sparseness is controlled by the aforementioned evaluation criterion. Further, we show the advantages of sparse principal loading analysis in both variable selection and covariance structure detection, and illustrate the performance of the method with simulations and on real datasets. Supplementary material for this article is available online.
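To illustrate why zero entries in the loadings can reveal a block-diagonal covariance structure, the following is a minimal worked example for the simplest case of two blocks. The symbols $\Sigma$, $\Sigma_1$, $\Sigma_2$, $V_1$, $V_2$, $\Lambda_1$, $\Lambda_2$, $u$ and $\lambda$ are notation assumed here for illustration and are not taken from the article. The example uses exact eigenvectors, not the sparse loadings or the evaluation criterion of the method itself.

```latex
% Assumed setting: a covariance matrix with two uncorrelated groups of variables.
\[
\Sigma =
\begin{pmatrix}
\Sigma_1 & 0 \\
0 & \Sigma_2
\end{pmatrix},
\qquad
\Sigma_1 \in \mathbb{R}^{p_1 \times p_1},\quad
\Sigma_2 \in \mathbb{R}^{p_2 \times p_2}.
\]
% An eigenvector of one block, padded with zeros, is an eigenvector of \Sigma.
\[
\Sigma_1 u = \lambda u
\quad\Longrightarrow\quad
\Sigma \begin{pmatrix} u \\ 0 \end{pmatrix}
= \begin{pmatrix} \Sigma_1 u \\ 0 \end{pmatrix}
= \lambda \begin{pmatrix} u \\ 0 \end{pmatrix}.
\]
% Conversely, orthogonal loadings with this zero pattern force a block-diagonal \Sigma.
\[
V =
\begin{pmatrix}
V_1 & 0 \\
0 & V_2
\end{pmatrix},
\quad
\Lambda =
\begin{pmatrix}
\Lambda_1 & 0 \\
0 & \Lambda_2
\end{pmatrix}
\quad\Longrightarrow\quad
V \Lambda V^{\top} =
\begin{pmatrix}
V_1 \Lambda_1 V_1^{\top} & 0 \\
0 & V_2 \Lambda_2 V_2^{\top}
\end{pmatrix}.
\]
```

In this idealized case the zero pattern of the loadings matches the block structure exactly. In a sample the off-diagonal blocks are rarely exactly zero, which is why a criterion is needed to judge whether a detected block structure fits the data.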