Sparse Principal Components Analysis (PCA) has been proposed as a way to improve both interpretability and reliability of PCA. However, use of sparse PCA in practice is hindered by the difficulty of tuning the multiple hyperparameters that control the sparsity of different PCs (the "multiple tuning problem", MTP). Here we present a solution to the MTP using empirical Bayes methods. We first introduce a general formulation for penalized PCA of a data matrix $\mathbf{X}$, which includes some existing sparse PCA methods as special cases. We show that this formulation also leads to a penalized decomposition of the covariance (or Gram) matrix, $\mathbf{X}^T\mathbf{X}$. We introduce empirical Bayes versions of these penalized problems, in which the penalties are determined by prior distributions that are estimated from the data by maximum likelihood rather than cross-validation. The resulting "Empirical Bayes Covariance Decomposition" provides a principled and efficient solution to the MTP in sparse PCA, and one that can be immediately extended to incorporate other structural assumptions (e.g., non-negative PCA). We illustrate the effectiveness of this approach on both simulated and real data examples.