We consider large signal-plus-noise data matrices of the form $S + \Sigma^{1/2} X$, where $S$ is a low-rank deterministic signal matrix and the noise covariance matrix $\Sigma$ may be anisotropic. Under general assumptions on the structure of $(S, \Sigma)$ and the distribution of the random noise $X$, we establish the asymptotic joint distribution of the spiked singular values of such matrices when the dimensionality and sample size are comparably large and the signals are supercritical. The asymptotic distributions turn out to be nonuniversal, in the sense that they depend on the distributions of the entries of $X$; this contrasts with what has previously been established for the spiked sample eigenvalues in spiked population models. As a consequence, we obtain the asymptotic distribution of the sample spiked eigenvalues associated with mixture models. We also explore the application of these findings to detecting mean heterogeneity in data matrices.
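To make the model concrete, the following is a minimal simulation sketch of a matrix of the form $S + \Sigma^{1/2} X$ and its leading singular values. Everything specific here is an illustrative assumption rather than something taken from the paper: the function name `simulate_spiked_singular_values`, the dimensions, the signal strengths, the diagonal choice of $\Sigma$, the Gaussian entries of $X$, and the $1/\sqrt{n}$ scaling that keeps the noise singular values of order one.

```python
import numpy as np


def simulate_spiked_singular_values(p=200, n=400, strengths=(8.0, 5.0), seed=0):
    """Simulate Y = S + Sigma^{1/2} X and return singular values of Y and of the noise.

    All defaults are hypothetical choices made for illustration only.
    """
    rng = np.random.default_rng(seed)
    r = len(strengths)

    # Low-rank deterministic-style signal S = U diag(strengths) V^T,
    # with orthonormal columns obtained from a QR factorization.
    U, _ = np.linalg.qr(rng.standard_normal((p, r)))
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    S = U @ np.diag(strengths) @ V.T

    # Anisotropic noise covariance (assumed diagonal here for simplicity).
    sigma_diag = np.linspace(0.5, 1.5, p)

    # Noise with i.i.d. entries of variance 1/n (Gaussian is an assumption).
    X = rng.standard_normal((p, n)) / np.sqrt(n)
    noise = np.sqrt(sigma_diag)[:, None] * X  # equals Sigma^{1/2} X for diagonal Sigma

    Y = S + noise
    sv = np.linalg.svd(Y, compute_uv=False)          # sorted in descending order
    noise_sv = np.linalg.svd(noise, compute_uv=False)
    return sv, noise_sv


if __name__ == "__main__":
    sv, noise_sv = simulate_spiked_singular_values()
    print("leading singular values of Y:", sv[:4])
    print("largest singular value of the noise alone:", noise_sv[0])
```

With signals this strong, the first two singular values of the simulated matrix separate from the noise bulk, which is the supercritical regime the abstract refers to. Replacing the Gaussian draw with another unit-variance distribution and repeating over many seeds would be one way to examine empirically how the fluctuations of the spiked singular values depend on the entry distribution.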