This paper addresses model order selection in large-dimensional, correlated, non-Gaussian noise. The sources are assumed to be embedded in additive Complex Elliptically Symmetric (CES) noise whose scatter matrix is unknown but Toeplitz-structured. We propose a two-stage robust framework: (i) a noise-whitening step based on a Toeplitz-rectified $M$-estimator of the scatter matrix, and (ii) inference of the signal subspace rank using large-dimensional Random Matrix Theory (RMT). In the regime where the observation dimension $m$ and the sample size $N$ grow proportionally, we establish the almost sure consistency of the proposed estimators and derive explicit RMT upper bounds on the noise eigenvalues, which separate the signal components from the noise components. Three estimation branches are derived, in which whitening relies respectively on the sample covariance matrix (SCM), Maronna's $M$-estimator, and the distribution-free Tyler $M$-estimator. The methodology is validated on synthetic data, real hyperspectral images, EEG recordings, and financial data, where it yields significant gains over the Akaike Information Criterion (AIC) and over unwhitened methods.
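The two-stage idea can be illustrated with a minimal NumPy sketch of the SCM branch only. It is not the authors' implementation, and it rests on several assumptions that are not in the text above: the noise is Gaussian (a special case of CES) with an AR(1) Toeplitz covariance; the Toeplitz rectification divides each diagonal sum by $m$, chosen here because it keeps the estimate positive semidefinite; the threshold is the Marchenko-Pastur upper edge $(1+\sqrt{m/N})^2$ inflated by a heuristic `margin`; and all function names and parameter values are hypothetical. The Maronna and Tyler branches are not shown.

```python
import numpy as np

def toeplitz_rectify(S):
    """Map a Hermitian matrix to a Hermitian Toeplitz one.

    Each sub-diagonal is summed and divided by m (biased autocorrelation
    normalisation, an assumption of this sketch), which keeps the result
    positive semidefinite whenever S is.
    """
    m = S.shape[0]
    c = np.array([np.trace(S, offset=-k) for k in range(m)]) / m  # first column
    lag = np.arange(m)[:, None] - np.arange(m)[None, :]            # i - j
    return np.where(lag >= 0, c[np.abs(lag)], np.conj(c[np.abs(lag)]))

def whitening_matrix(T, floor=1e-8):
    """Inverse Hermitian square root of T; eigenvalues are floored for safety."""
    w, V = np.linalg.eigh(T)
    w = np.maximum(w, floor)
    return (V / np.sqrt(w)) @ V.conj().T

def estimate_rank(X, margin=0.25):
    """Count whitened-SCM eigenvalues above an inflated Marchenko-Pastur edge.

    X has shape (m, N). `margin` is a heuristic safety factor (hypothetical)
    absorbing finite-sample fluctuations and scatter estimation error.
    """
    m, N = X.shape
    S = X @ X.conj().T / N                        # sample covariance matrix
    W = whitening_matrix(toeplitz_rectify(S))     # stage (i): whitening
    eig = np.linalg.eigvalsh(W @ S @ W.conj().T)  # stage (ii): spectrum
    threshold = (1 + margin) * (1 + np.sqrt(m / N)) ** 2
    return int(np.sum(eig > threshold)), threshold

# Synthetic check: p = 2 sources in correlated noise (illustrative values).
rng = np.random.default_rng(0)
m, N, p, rho = 40, 1000, 2, 0.5

def cgauss(shape):
    """Circular complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

lag = np.abs(np.arange(m)[:, None] - np.arange(m)[None, :])
C = rho ** lag                                    # AR(1) Toeplitz noise covariance
noise = np.linalg.cholesky(C) @ cgauss((m, N))
A = cgauss((m, p))
A /= np.linalg.norm(A, axis=0)                    # unit-norm random mixing vectors
X = 3.0 * A @ cgauss((p, N)) + noise

rank_hat, thr = estimate_rank(X)
print("estimated rank:", rank_hat, "threshold:", thr)
```

The mixing vectors are drawn at random rather than as uniform-array steering vectors on purpose: a steering-vector outer product is itself Toeplitz, so this naive rectification would absorb it into the noise estimate and whiten the source away.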