We study the change point detection problem for high-dimensional linear regression models. The existing literature has mainly focused on change point estimation under stringent sub-Gaussian assumptions on the errors. In practice, however, there is no prior knowledge about the existence of a change point or the tail structure of the errors. To address these issues, we propose a novel tail-adaptive approach for simultaneous change point testing and estimation. The method is built on a new loss function, a weighted combination of the composite quantile and least squares losses, which allows us to borrow information about possible change points from both the conditional mean and the conditional quantiles. For change point testing, based on an adjusted $L_2$-norm aggregation of a weighted score CUSUM process, we propose a family of individual test statistics with different weights to account for the unknown tail structure. Combining the individual tests, we further construct a tail-adaptive test that is powerful against sparse alternatives for changes in the regression coefficients under various tail structures. For change point estimation, a family of argmax-based individual estimators is proposed for use once a change point is detected. For both the individual and tail-adaptive tests, bootstrap procedures are proposed to approximate their limiting null distributions. Under mild conditions, we justify the validity of the new tests in terms of size and power in the high-dimensional setup. The corresponding change point estimators are shown to be rate optimal up to a logarithmic factor. Moreover, combined with the wild binary segmentation technique, a new algorithm is proposed to detect multiple change points in a tail-adaptive manner. Extensive numerical studies are conducted to illustrate the appealing performance of the proposed method.
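To make the weighted loss concrete, the following is a minimal Python sketch of one plausible form: a convex combination, with weight `alpha`, of a composite quantile loss averaged over several quantile levels and a least squares loss. The names `check_loss`, `weighted_loss`, `alpha`, `taus`, and `intercepts`, as well as the factor 1/2 and the averaging over samples, are illustrative assumptions and are not taken from the abstract.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def weighted_loss(y, X, beta, intercepts, taus, alpha):
    """Assumed form: (1 - alpha) * composite quantile loss + alpha * least squares loss.

    alpha = 1 gives a pure least squares loss (conditional mean);
    alpha = 0 gives a pure composite quantile loss (conditional quantiles).
    """
    resid = y - X @ beta
    # Composite quantile part: average the check loss over the quantile
    # levels, each level tau_k having its own intercept b_k (an assumption).
    cq = np.mean([check_loss(resid - b, tau).mean()
                  for b, tau in zip(intercepts, taus)])
    # Least squares part, with the conventional 1/2 factor (an assumption).
    ls = 0.5 * np.mean(resid ** 2)
    return (1.0 - alpha) * cq + alpha * ls

# Small usage example with residuals [1, 0, -2].
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
beta = np.array([1.0, 2.0])
y = np.array([2.0, 2.0, 1.0])
loss_mixed = weighted_loss(y, X, beta, intercepts=[0.0], taus=[0.5], alpha=0.5)
```

Under this sketch, varying `alpha` between 0 and 1 moves the loss between a quantile-driven criterion, which is less sensitive to heavy-tailed errors, and a mean-driven one. This is the sense in which a family of weights can account for unknown tail structures.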