The two-sample spiked model is an important topic in multivariate statistical inference. This paper focuses on testing the number of spikes in a high-dimensional generalized two-sample spiked model. The model requires neither a Gaussian population assumption nor a diagonal or block-wise diagonal population covariance matrix, and the spiked eigenvalues are not required to be bounded. To determine the number of spikes, we first propose a general test based on partial linear spectral statistics and establish its asymptotic normality under the null hypothesis. We then apply this result to two statistical problems: variable selection in large-dimensional linear regression, and change point detection when change points and additive outliers exist simultaneously. Simulations and an empirical analysis illustrate the good performance of our methods.