Quantum principal component analysis (qPCA) is commonly formulated as the extraction of eigenvalues and eigenvectors of a covariance-encoded density operator. Yet in many qPCA settings, the practical objective is simpler: projecting data onto the dominant spectral subspace. In this work, we introduce a projection-first framework, the Filtered Spectral Projection Algorithm (FSPA), which bypasses explicit eigenvalue estimation while preserving the essential spectral structure. FSPA amplifies any nonzero warm-start overlap with the leading principal subspace and remains robust in small-gap and near-degenerate regimes without inducing artificial symmetry breaking in the absence of bias. To connect this approach to classical datasets, we show that for amplitude-encoded centered data, the ensemble density matrix $ρ=\sum_i p_i|ψ_i\rangle\langleψ_i|$ coincides with the covariance matrix. For uncentered data, $ρ$ corresponds to PCA without centering, and we derive eigenvalue interlacing bounds quantifying the deviation from standard PCA. We further show that ensembles of quantum states admit an equivalent centered covariance interpretation. Numerical demonstrations on benchmark datasets, including Breast Cancer Wisconsin and handwritten Digits, show that downstream performance remains stable whenever projection quality is preserved. These results suggest that, in a broad class of qPCA settings, spectral projection is the essential primitive, and explicit eigenvalue estimation is often unnecessary.
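The link between the ensemble density matrix and the covariance matrix can be checked numerically on classical data. The sketch below is illustrative only and is not the FSPA procedure itself. It assumes a particular amplitude encoding, $|ψ_i\rangle = x_i/\|x_i\|$ with weights $p_i = \|x_i\|^2/\sum_j \|x_j\|^2$, which the abstract does not specify; under that assumption $ρ$ equals the covariance of centered data up to trace normalization. The helper name `ensemble_density_matrix` and the synthetic data are hypothetical. The final check uses the standard rank-one interlacing fact for the uncentered second-moment matrix $C + μμ^T$, not the specific bounds derived in the paper.

```python
import numpy as np

def ensemble_density_matrix(X):
    """rho = sum_i p_i |psi_i><psi_i| for rows x_i of X.

    Assumed encoding (not stated in the source):
    |psi_i> = x_i / ||x_i||,  p_i = ||x_i||^2 / sum_j ||x_j||^2.
    """
    norms_sq = np.sum(X**2, axis=1)
    keep = norms_sq > 0                      # zero vectors cannot be amplitude-encoded
    psi = X[keep] / np.sqrt(norms_sq[keep])[:, None]
    p = norms_sq[keep] / norms_sq[keep].sum()
    return (psi * p[:, None]).T @ psi

# Synthetic data with a nonzero mean (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5)) + 3.0
Xc = X - X.mean(axis=0)

# Centered case: rho matches the covariance matrix rescaled to unit trace.
rho_centered = ensemble_density_matrix(Xc)
cov = Xc.T @ Xc / len(Xc)
matches = np.allclose(rho_centered, cov / np.trace(cov))

# Uncentered case: the second-moment matrix is C + mu mu^T, a rank-one
# positive semidefinite update, so the eigenvalues interlace.
second_moment = X.T @ X / len(X)
ev_c = np.sort(np.linalg.eigvalsh(cov))[::-1]            # descending
ev_m = np.sort(np.linalg.eigvalsh(second_moment))[::-1]  # descending
tol = 1e-8
interlaced = bool(np.all(ev_m >= ev_c - tol) and np.all(ev_m[1:] <= ev_c[:-1] + tol))

print("rho equals trace-normalized covariance:", matches)
print("eigenvalues interlace:", interlaced)
```

With a different choice of weights $p_i$ (for example uniform weights), $ρ$ would instead correspond to the covariance of the normalized vectors, so the agreement shown here depends on the assumed encoding.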