Bayesian factor analysis is routinely used for dimensionality reduction when modeling high-dimensional covariance matrices. Factor analytic decompositions express the covariance as the sum of a low-rank matrix and a diagonal matrix. In practice, posterior computation typically relies on Gibbs sampling algorithms that alternate between updating the latent factors, the loadings, and the residual variances. In this article, we exploit a blessing of dimensionality to develop a provably accurate pseudo-posterior for the covariance matrix that bypasses the need for Gibbs sampling or other variants of Markov chain Monte Carlo. Our proposed Factor Analysis with BLEssing of dimensionality (FABLE) approach uses a first-stage singular value decomposition (SVD) to estimate the latent factors and then defines a jointly conjugate prior for the loadings and residual variances. The accuracy of the resulting pseudo-posterior for the covariance improves with increasing dimensionality. We show, both theoretically and through simulation experiments, that FABLE has excellent performance in high-dimensional covariance matrix estimation, including well-calibrated credible intervals. We also demonstrate the accuracy of inference and the computational efficiency of our approach by applying it to a gene expression data set.
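The two-stage idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `fable_sketch`, the hyperparameters `tau2`, `a0`, `b0`, the scaling of the estimated factors by the square root of the sample size, and the specific normal-inverse-gamma prior are all assumptions made for illustration, and any further refinements of the full method are omitted. The sketch estimates the factors from an SVD of the data matrix, treats them as fixed, and then draws loadings and residual variances independently from a conjugate posterior, with no Markov chain involved.

```python
import numpy as np


def fable_sketch(Y, k, n_samples=200, tau2=1.0, a0=1.0, b0=1.0, seed=0):
    """Draw covariance samples from a simplified SVD-based pseudo-posterior.

    Y is an (n, p) data matrix assumed to be centered; k is the number of
    factors. All hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, p = Y.shape

    # Stage 1: estimate latent factors from the top-k left singular vectors.
    # The sqrt(n) scaling (an assumption) gives M.T @ M = n * I_k.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    M = np.sqrt(n) * U[:, :k]

    # Stage 2: for each column y_j of Y, a conjugate regression on M with
    #   lambda_j | sigma_j^2 ~ N(0, tau2 * sigma_j^2 * I_k),
    #   sigma_j^2 ~ Inverse-Gamma(a0, b0).
    # Because M.T @ M = n * I_k, the posterior precision is a scalar.
    prec = n + 1.0 / tau2
    mean = (M.T @ Y) / prec  # (k, p) posterior means of the loadings
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * (np.sum(Y**2, axis=0) - prec * np.sum(mean**2, axis=0))

    draws = np.empty((n_samples, p, p))
    for t in range(n_samples):
        # Inverse-gamma draws via the reciprocal of gamma draws.
        sigma2 = b_n / rng.gamma(shape=a_n, scale=1.0, size=p)
        # Loadings given the residual variances, drawn independently.
        lam = mean + rng.standard_normal((k, p)) * np.sqrt(sigma2 / prec)
        # Covariance sample: low-rank part plus diagonal part.
        draws[t] = lam.T @ lam + np.diag(sigma2)
    return draws
```

Averaging the returned array over its first axis gives a point estimate of the covariance, and entrywise quantiles give approximate credible intervals. Whether those intervals are well calibrated under this simplified prior is not something the sketch establishes.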