We study principal component analysis under the spiked Wishart model, where the structure of the signal is captured by a class of union-of-subspace models. This general class includes vanilla sparse PCA as well as its variants with graph sparsity. To study these problems under a unified statistical and computational lens, we establish fundamental limits that depend on the geometry of the problem instance, and show that a natural projected power method converges locally to a statistically near-optimal neighborhood of the solution. We complement these results with end-to-end analyses of two important special cases, path sparsity and tree sparsity in a general basis, providing initialization methods and matching evidence of computational hardness. Overall, our results indicate that several of the phenomena observed for vanilla sparse PCA extend in a natural fashion to its structured counterparts.
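As a rough illustration of the projected power method mentioned above, the following sketch covers only the vanilla sparse PCA special case, where the union of subspaces is the set of all k-sparse coordinate subspaces and the projection reduces to hard thresholding. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names (`project_k_sparse`, `projected_power_method`, `sample_spiked_wishart`), the problem sizes, the spike strength `lam`, and the warm start near the true signal are all hypothetical choices made for the example. The warm start stands in for an initialization method, since the convergence guarantee described is local.

```python
import numpy as np


def project_k_sparse(u, k):
    """Project u onto the unit-norm k-sparse vectors (hard thresholding)."""
    out = np.zeros_like(u)
    idx = np.argpartition(np.abs(u), -k)[-k:]  # indices of the k largest |u_i|
    out[idx] = u[idx]
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out


def projected_power_method(sigma_hat, v0, k, n_iter=50):
    """Alternate a power step with projection onto the k-sparse model."""
    v = project_k_sparse(v0, k)
    for _ in range(n_iter):
        v = project_k_sparse(sigma_hat @ v, k)
    return v


def sample_spiked_wishart(n, v_star, lam, rng):
    """Draw n samples with covariance I + lam * v_star v_star^T."""
    d = v_star.shape[0]
    g = rng.standard_normal((n, 1))   # per-sample spike coefficients
    z = rng.standard_normal((n, d))   # isotropic noise
    return np.sqrt(lam) * g * v_star + z


# Hypothetical problem sizes chosen only for illustration.
rng = np.random.default_rng(0)
d, k, n, lam = 200, 10, 2000, 2.0

v_star = np.zeros(d)
v_star[:k] = 1.0 / np.sqrt(k)         # k-sparse unit-norm signal

X = sample_spiked_wishart(n, v_star, lam, rng)
sigma_hat = X.T @ X / n               # sample covariance

# Assumed warm start: the true signal plus a small perturbation.
v0 = v_star + 0.3 * rng.standard_normal(d) / np.sqrt(d)

v_hat = projected_power_method(sigma_hat, v0, k)
alignment = abs(v_hat @ v_star)       # |<v_hat, v_star>|, sign-invariant
print(f"alignment with true signal: {alignment:.3f}")
```

For structured variants such as path or tree sparsity, only `project_k_sparse` would change: it would be replaced by a projection onto the corresponding union of subspaces, while the power step would stay the same.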