Consider the spiked Wigner model \[ X = \sum_{i = 1}^k \lambda_i u_i u_i^\top + \sigma G, \] where $G$ is an $N \times N$ GOE random matrix, and the eigenvalues $\lambda_i$ are all spiked, i.e. above the Baik-Ben Arous-P\'ech\'e (BBP) threshold $\sigma$. We consider AIC-type model selection criteria of the form \[ -2 \, (\text{maximised log-likelihood}) + \gamma \, (\text{number of parameters}) \] for estimating the number $k$ of spikes. For $\gamma > 2$, the above criterion is strongly consistent provided $\lambda_k > \lambda_{\gamma}$, where $\lambda_{\gamma}$ is a threshold strictly above the BBP threshold, whereas for $\gamma < 2$, it almost surely overestimates $k$. Although AIC (which corresponds to $\gamma = 2$) is not strongly consistent, we show that taking $\gamma = 2 + \delta_N$, where $\delta_N \to 0$ and $\delta_N \gg N^{-2/3}$, results in a weakly consistent estimator of $k$. We also show that a certain soft minimiser of AIC is strongly consistent.
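As a rough illustration of the model itself, and not of the selection criterion studied here, the following sketch samples $X$ and compares its top eigenvalues with the standard BBP prediction $\lambda_i + \sigma^2/\lambda_i$ for $\lambda_i > \sigma$, against the bulk edge $2\sigma$. It assumes the GOE normalisation in which off-diagonal entries have variance $1/N$ (so the bulk of $\sigma G$ fills $[-2\sigma, 2\sigma]$), and draws orthonormal $u_i$ from a QR factorisation. The function names `sample_goe`, `sample_spiked_wigner`, `top_eigenvalues` and `bbp_limit` are hypothetical and chosen only for this example.

```python
import numpy as np


def sample_goe(n, rng):
    """Sample an n x n GOE matrix (assumed normalisation:
    off-diagonal variance 1/n, diagonal variance 2/n)."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2 * n)


def sample_spiked_wigner(n, spikes, sigma, rng):
    """Sample X = sum_i lambda_i u_i u_i^T + sigma * G with orthonormal u_i."""
    spikes = np.asarray(spikes, dtype=float)
    # Orthonormal spike directions from the QR factorisation of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((n, len(spikes))))
    signal = (q * spikes) @ q.T
    return signal + sigma * sample_goe(n, rng)


def top_eigenvalues(x, m):
    """Return the m largest eigenvalues of the symmetric matrix x, in decreasing order."""
    return np.linalg.eigvalsh(x)[::-1][:m]


def bbp_limit(lam, sigma):
    """Large-N limit of the eigenvalue of X associated with a spike of size lam."""
    if lam > sigma:
        return lam + sigma**2 / lam  # outlier separated from the bulk
    return 2 * sigma  # spike is absorbed into the bulk edge


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma, spikes = 1.0, [3.0, 2.0]
    x = sample_spiked_wigner(1000, spikes, sigma, rng)
    observed = top_eigenvalues(x, len(spikes) + 1)
    predicted = [bbp_limit(lam, sigma) for lam in spikes] + [2 * sigma]
    for obs, pred in zip(observed, predicted):
        print(f"observed {obs:.3f}   predicted {pred:.3f}")
```

Counting the eigenvalues that sit visibly above $2\sigma$ is only a naive heuristic for $k$; it is meant to show why spikes above the BBP threshold are detectable at all, not to stand in for the penalised-likelihood criterion.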