Probabilistic principal component analysis (PPCA) is one of the most widely used statistical tools for reducing the ambient dimension of data. With applications ranging from multidimensional scaling to the imputation of missing data, PPCA is used across science, engineering, and quantitative finance. Despite this wide applicability, hardly any theoretical guarantees exist to justify the soundness of the maximum likelihood (ML) solution for this model. In fact, it is well known that the maximum likelihood estimate (MLE) can recover the true model parameters only up to a rotation. The main obstruction is the inherent non-identifiability of the PPCA model, which results from the rotational symmetry of its parameterization. To resolve this ambiguity, we propose a novel approach using quotient topological spaces; in particular, we show that the maximum likelihood solution is consistent in an appropriate quotient Euclidean space. Furthermore, our consistency results encompass a more general class of estimators beyond the MLE. Strong consistency of the ML estimate, and consequently strongly consistent covariance estimation for the PPCA model, is also established under a compactness assumption.
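The rotational ambiguity can be illustrated numerically. The sketch below assumes the standard PPCA parameterization, which the abstract does not spell out: a loading matrix W and an isotropic noise variance sigma^2, giving the model covariance C = W W^T + sigma^2 I. The matrix `W`, the value `sigma2`, and all helper names are hypothetical and chosen only for illustration. Since (WR)(WR)^T = W W^T for any rotation R, the two parameter sets define the same Gaussian model, so the likelihood alone cannot distinguish them.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def ppca_covariance(W, sigma2):
    """Model covariance C = W W^T + sigma^2 I (assumed standard PPCA form)."""
    WWt = matmul(W, transpose(W))
    d = len(W)
    return [[WWt[i][j] + (sigma2 if i == j else 0.0) for j in range(d)]
            for i in range(d)]

def rotation(theta):
    """2x2 rotation matrix acting on the latent space."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical 3x2 loading matrix and noise variance, for illustration only.
W = [[1.0, 0.5], [0.0, 2.0], [-1.0, 1.0]]
sigma2 = 0.3

# Rotate the latent coordinates: W and W_rot are different parameters.
W_rot = matmul(W, rotation(0.7))

C = ppca_covariance(W, sigma2)
C_rot = ppca_covariance(W_rot, sigma2)

# Largest entrywise difference between the two model covariances.
max_gap = max(abs(C[i][j] - C_rot[i][j]) for i in range(3) for j in range(3))
print("parameters differ:", W != W_rot)
print("covariances agree:", max_gap < 1e-12)
```

Identifying W with every WR, as the quotient construction does, treats these two loading matrices as a single point, which is the sense in which consistency can be stated.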