A New Basis for Sparse Principal Component Analysis

Previous versions of sparse principal component analysis (PCA) have presumed that the eigen-basis (a $p \times k$ matrix) is approximately sparse. We propose a method that presumes the $p \times k$ matrix becomes approximately sparse after a $k \times k$ rotation. The simplest version of the algorithm initializes with the leading $k$ principal components. Then, the principal components are rotated with a $k \times k$ orthogonal rotation to make them approximately sparse. Finally, soft-thresholding is applied to the rotated principal components. This approach differs from prior approaches because it uses an orthogonal rotation to approximate a sparse basis. One consequence is that a sparse component need not be a leading eigenvector; it can instead be a mixture of the leading eigenvectors. In this way, we propose a new (rotated) basis for sparse PCA. In addition, our approach avoids "deflation" and the multiple tuning parameters that deflation requires. Our sparse PCA framework is versatile; for example, it extends naturally to a two-way analysis of a data matrix for simultaneous dimensionality reduction of rows and columns. We provide evidence showing that, for the same level of sparsity, the proposed sparse PCA method is more stable and can explain more variance than alternative methods. Through three applications (sparse coding of images, analysis of transcriptome sequencing data, and large-scale clustering of social networks), we demonstrate the modern usefulness of sparse PCA in exploring multivariate data.
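The three steps of the simplest version (leading $k$ principal components, a $k \times k$ orthogonal rotation, then soft-thresholding) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the rotation criterion, so varimax is assumed here as one common choice of sparsity-seeking orthogonal rotation, and the function names (`varimax`, `soft_threshold`, `rotated_sparse_pca`) and the `threshold` parameter are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Return a k x k orthogonal matrix R that (locally) maximizes the
    varimax criterion of loadings @ R. ASSUMPTION: varimax stands in for
    the unspecified sparsity-seeking rotation."""
    _, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        grad = loadings.T @ (rotated**3 - rotated * (rotated**2).mean(axis=0))
        # Project the gradient onto the set of orthogonal matrices.
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return rotation


def soft_threshold(x, t):
    """Shrink every entry toward zero by t; entries with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def rotated_sparse_pca(X, k, threshold):
    """Hypothetical helper: PCA, then rotation, then soft-thresholding."""
    Xc = X - X.mean(axis=0)                   # center the columns
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    V = vt[:k].T                              # p x k leading principal components
    R = varimax(V)                            # k x k orthogonal rotation
    Y = V @ R                                 # rotated basis, same column space as V
    return soft_threshold(Y, threshold), R
```

Because `R` is orthogonal, `V @ R` spans the same subspace as the leading $k$ principal components, so each rotated column is a mixture of leading eigenvectors rather than a single one. The thresholded columns are no longer exactly orthonormal, and a single `threshold` is the only tuning parameter in this sketch.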



PCA


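In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each dimension. Given a set of points in two, three, or more dimensions, a "best-fit" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fit line can be chosen in the same way from the directions perpendicular to the first line. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.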
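The description above can be illustrated with a short sketch that computes principal components from the singular value decomposition of the centered data matrix. The function name `principal_components` and its return values are hypothetical, chosen only for this example.

```python
import numpy as np


def principal_components(X, k):
    """Hypothetical helper: return the k leading principal components
    (as columns), the projected scores, and the variance along each one."""
    Xc = X - X.mean(axis=0)                   # center each column
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:k].T                     # p x k orthonormal basis vectors
    scores = Xc @ components                  # n x k coordinates in the new basis
    explained_variance = s[:k] ** 2 / (X.shape[0] - 1)
    return components, scores, explained_variance
```

The columns of `components` are mutually orthogonal, and the sample covariance matrix of `scores` is diagonal, which is the sense in which the new dimensions are uncorrelated.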
