Sparse Principal Components Analysis (PCA) has been proposed as a way to improve both interpretability and reliability of PCA. However, use of sparse PCA in practice is hindered by the difficulty of tuning the multiple hyperparameters that control the sparsity of different PCs (the "multiple tuning problem", MTP). Here we present a solution to the MTP using Empirical Bayes methods. We first introduce a general formulation for penalized PCA of a data matrix $\mathbf{X}$, which includes some existing sparse PCA methods as special cases. We show that this formulation also leads to a penalized decomposition of the covariance (or Gram) matrix, $\mathbf{X}^T\mathbf{X}$. We introduce empirical Bayes versions of these penalized problems, in which the penalties are determined by prior distributions that are estimated from the data by maximum likelihood rather than cross-validation. The resulting "Empirical Bayes Covariance Decomposition" provides a principled and efficient solution to the MTP in sparse PCA, and one that can be immediately extended to incorporate other structural assumptions (e.g. non-negative PCA). We illustrate the effectiveness of this approach on both simulated and real data examples.
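To make the penalized formulation concrete, the following is a minimal rank-one sketch, not the method of the paper. It assumes an $\ell_1$ penalty with a hand-picked weight `lam`, and alternately minimizes $\|\mathbf{X} - \mathbf{z}\boldsymbol{\ell}^T\|_F^2 + 2\lambda\|\boldsymbol{\ell}\|_1$ subject to $\|\mathbf{z}\|_2 = 1$. The names `soft_threshold`, `sparse_rank_one`, and `lam`, and the simulated data, are illustrative assumptions. Because $\mathbf{z}$ has unit norm, $(\mathbf{z}\boldsymbol{\ell}^T)^T(\mathbf{z}\boldsymbol{\ell}^T) = \boldsymbol{\ell}\boldsymbol{\ell}^T$, which is the link between fitting $\mathbf{X}$ and decomposing $\mathbf{X}^T\mathbf{X}$.

```python
import math
import random


def soft_threshold(a, lam):
    """Shrink a toward zero by lam; values with |a| <= lam become exactly 0."""
    return math.copysign(max(abs(a) - lam, 0.0), a)


def sparse_rank_one(X, lam, n_iter=100):
    """Alternating minimization of ||X - z load^T||_F^2 + 2*lam*||load||_1
    subject to ||z||_2 = 1. X is a list of n rows of p floats.
    Illustrative only: lam is fixed by hand, not estimated from the data."""
    n, p = len(X), len(X[0])
    # Initialise the loadings with the row of X that has the largest norm.
    load = list(max(X, key=lambda row: sum(v * v for v in row)))
    z = [0.0] * n
    for _ in range(n_iter):
        # z-step: for fixed loadings the optimal unit vector is X load / ||X load||.
        Xl = [sum(X[i][j] * load[j] for j in range(p)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in Xl))
        if norm == 0.0:  # loadings were thresholded to zero; nothing left to fit
            break
        z = [v / norm for v in Xl]
        # load-step: for fixed unit z, soft-threshold each entry of X^T z.
        load = [
            soft_threshold(sum(X[i][j] * z[i] for i in range(n)), lam)
            for j in range(p)
        ]
    return z, load


# Simulated data (an assumption for this demo): one factor that loads on the
# first three of eight variables, plus small Gaussian noise.
rng = random.Random(0)
true_load = [3.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0]
X = []
for _ in range(50):
    f = rng.gauss(0.0, 1.0)
    X.append([f * t + 0.1 * rng.gauss(0.0, 1.0) for t in true_load])

z, load = sparse_rank_one(X, lam=0.5)
print("estimated loadings:", [round(v, 3) for v in load])

# Gram-matrix view: since z has unit norm, the fitted term contributes
# load[j] * load[k] to entry (j, k) of X^T X.
gram_00 = sum(row[0] * row[0] for row in X)
print("X^T X [0][0]:", round(gram_00, 3), " load[0]^2:", round(load[0] ** 2, 3))
```

In this sketch a single `lam` has to be chosen by hand. With $K$ components, each would need its own penalty weight, which is the multiple tuning problem described above; the empirical Bayes approach replaces that manual choice by estimating the penalty from the data.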