Bayesian factor analysis is routinely used for dimensionality reduction when modeling high-dimensional covariance matrices. Factor analytic decompositions express the covariance as the sum of a low-rank matrix and a diagonal matrix. In practice, Gibbs sampling algorithms are typically used for posterior computation, alternating between updates of the latent factors, the loadings, and the residual variances. In this article, we exploit a blessing of dimensionality to develop a provably accurate posterior approximation for the covariance matrix that bypasses the need for Gibbs sampling or other variants of Markov chain Monte Carlo. Our proposed Factor Analysis with BLEssing of dimensionality (FABLE) approach relies on a first-stage singular value decomposition (SVD) to estimate the latent factors, and then defines a jointly conjugate prior for the loadings and residual variances. The accuracy of the resulting posterior approximation for the covariance improves as both the sample size and the dimensionality increase. We show, both theoretically and through simulation experiments, that FABLE has excellent performance in high-dimensional covariance matrix estimation, including well-calibrated credible intervals. We also demonstrate the accuracy of inference and the computational efficiency of our approach by applying it to a gene expression dataset.
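The two-stage idea described above (estimate the latent factors by SVD, then update the loadings and residual variances under a conjugate prior) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the prior, so the sketch assumes a normal-inverse-gamma prior on each row of the loadings with hypothetical hyperparameters `tau2`, `a0`, and `b0`. It also assumes centered data and a known number of factors `k`, and it computes only the posterior mean of the covariance, not credible intervals. All function names are made up for this example.

```python
import numpy as np


def simulate(n, p, k, rng):
    """Draw n samples from a k-factor model with covariance Lam Lam^T + diag(sigma2)."""
    Lam = rng.normal(size=(p, k))
    sigma2 = rng.uniform(0.5, 1.5, size=p)
    eta = rng.normal(size=(n, k))
    Y = eta @ Lam.T + rng.normal(size=(n, p)) * np.sqrt(sigma2)
    return Y, Lam @ Lam.T + np.diag(sigma2)


def svd_factors(Y, k):
    """First stage: estimate the latent factors (up to rotation) from the SVD of Y."""
    n = Y.shape[0]
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return np.sqrt(n) * U[:, :k]  # scaled so that M.T @ M = n * I_k


def posterior_mean_covariance(Y, M, tau2=1.0, a0=1.0, b0=1.0):
    """Second stage: closed-form posterior mean of Lam Lam^T + diag(sigma2).

    Assumed prior (not taken from the source), independently for each variable j:
        lambda_j | sigma_j^2 ~ N(0, tau2 * sigma_j^2 * I_k),  sigma_j^2 ~ InvGamma(a0, b0).
    The estimated factors M are treated as fixed regressors.
    """
    n, p = Y.shape
    k = M.shape[1]
    c = n + 1.0 / tau2                      # posterior precision is c * I_k since M.T @ M = n * I_k
    Lam_hat = (Y.T @ M) / c                 # p x k matrix of posterior mean loadings
    rss = np.sum(Y**2, axis=0) - c * np.sum(Lam_hat**2, axis=1)
    a_n = a0 + n / 2.0
    b_n = b0 + rss / 2.0
    sigma2_mean = b_n / (a_n - 1.0)         # posterior mean of each residual variance
    cov = Lam_hat @ Lam_hat.T               # E[lambda_j]^T E[lambda_l] for j != l
    # Diagonal entries add the posterior variance of lambda_j plus the residual variance.
    cov[np.diag_indices(p)] += k * sigma2_mean / c + sigma2_mean
    return cov


rng = np.random.default_rng(0)
Y, Sigma_true = simulate(n=500, p=200, k=3, rng=rng)
M = svd_factors(Y, k=3)
Sigma_hat = posterior_mean_covariance(Y, M)
rel_err = np.linalg.norm(Sigma_hat - Sigma_true) / np.linalg.norm(Sigma_true)
print(f"relative Frobenius error: {rel_err:.3f}")
```

Because the product of the loadings matrix with its transpose is invariant to rotations of the factors, the rotational ambiguity of the SVD-based factor estimate does not affect the covariance estimate. No sampling loop is needed: every quantity is available in closed form once the factors are fixed.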