Probabilistic principal component analysis (PPCA) is currently one of the most widely used statistical tools for reducing the ambient dimension of data. With uses from multidimensional scaling to the imputation of missing data, PPCA has applications across science, engineering, and quantitative finance. Despite this wide applicability, hardly any theoretical guarantees exist to justify the soundness of the maximum likelihood (ML) solution for this model. In fact, it is well known that the maximum likelihood estimate (MLE) can only recover the true model parameters up to a rotation. The main obstruction is the inherent non-identifiability of the PPCA model, which results from the rotational symmetry of the parameterization. To resolve this ambiguity, we propose a novel approach using quotient topological spaces; in particular, we show that the maximum likelihood solution is consistent in an appropriate quotient Euclidean space. Furthermore, our consistency results encompass a more general class of estimators beyond the MLE. We also establish strong consistency of the ML estimate and, consequently, strongly consistent covariance estimation for the PPCA model under a compactness assumption.
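The rotational ambiguity mentioned in the abstract can be illustrated numerically. The sketch below assumes the standard PPCA parameterization, which the abstract itself does not spell out: a loading matrix `W` of size d × q and isotropic noise variance `sigma2`, giving the marginal covariance W Wᵀ + σ²I. The helper `ppca_covariance` and all variable names are hypothetical and chosen only for illustration; this is not the authors' construction.

```python
import numpy as np

def ppca_covariance(W, sigma2):
    """Marginal covariance W W^T + sigma^2 I under the assumed PPCA model."""
    d = W.shape[0]
    return W @ W.T + sigma2 * np.eye(d)

rng = np.random.default_rng(0)
d, q = 5, 2                      # ambient and latent dimensions (illustrative)
W = rng.standard_normal((d, q))  # an arbitrary loading matrix
sigma2 = 0.5                     # an arbitrary noise variance

# Draw a q x q orthogonal matrix (a rotation or reflection) via QR.
R, _ = np.linalg.qr(rng.standard_normal((q, q)))

C_original = ppca_covariance(W, sigma2)
C_rotated = ppca_covariance(W @ R, sigma2)

# The loadings differ, yet the implied Gaussian distribution is the same,
# so the likelihood cannot distinguish W from W R.
same_covariance = bool(np.allclose(C_original, C_rotated))
different_loadings = not np.allclose(W, W @ R)
print(same_covariance, different_loadings)
```

Because (WR)(WR)ᵀ = W R Rᵀ Wᵀ = W Wᵀ for any orthogonal R, the two parameter sets define the same distribution. This is why consistency can only be stated for equivalence classes of loadings, which is the role the quotient space plays in the abstract.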