Composite quantile regression (CQR) was introduced by Zou and Yuan [Ann. Statist. 36 (2008) 1108--1126] as a robust regression method for linear models with heavy-tailed errors that nonetheless achieves high efficiency. Its penalized counterpart for high-dimensional sparse models was recently studied in Gu and Zou [IEEE Trans. Inf. Theory 66 (2020) 7132--7154], along with a specialized optimization algorithm based on the alternating direction method of multipliers (ADMM). Compared to the various first-order algorithms for penalized least squares, ADMM-based algorithms are not well adapted to large-scale problems. To overcome this computational bottleneck, in this paper we apply a convolution smoothing technique to CQR, complemented with iteratively reweighted $\ell_1$-regularization. The smoothed composite loss function is convex, twice continuously differentiable, and locally strongly convex with high probability. We propose a gradient-based algorithm for penalized smoothed CQR via a variant of the majorize-minimization principle, which gains substantial computational efficiency over ADMM. Theoretically, we show that the iteratively reweighted $\ell_1$-penalized smoothed CQR estimator achieves a near-minimax optimal convergence rate under heavy-tailed errors without any moment constraint, and further achieves a near-oracle convergence rate under a weaker minimum signal strength condition than that needed in Gu and Zou (2020). Numerical studies demonstrate that the proposed method exhibits significant computational advantages without compromising statistical performance compared to two state-of-the-art methods that achieve robustness and high efficiency simultaneously.
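As a rough illustration of why convolution smoothing makes CQR amenable to gradient-based methods, the sketch below evaluates a smoothed composite loss and its gradient. It rests on assumptions not stated in the abstract: a logistic kernel with bandwidth `h`, equally spaced quantile levels $\tau_k = k/(K+1)$, one intercept `alpha[k]` per level, and a shared slope vector `beta`. All function names are hypothetical, and the penalty and the majorize-minimization iterations are not implemented here.

```python
import numpy as np


def check_loss(u, tau):
    """Standard (non-smooth) quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))


def smoothed_check_loss(u, tau, h):
    """Check loss convolved with a logistic kernel of bandwidth h (assumed kernel).

    Closed form: tau * u + h * log(1 + exp(-u / h)), evaluated stably.
    """
    return tau * u + h * np.logaddexp(0.0, -u / h)


def smoothed_check_grad(u, tau, h):
    """Derivative in u: tau - 1 / (1 + exp(u / h)), written with tanh for stability."""
    return tau - 0.5 * (1.0 - np.tanh(u / (2.0 * h)))


def smoothed_cqr(alpha, beta, X, y, h):
    """Smoothed composite loss and its gradient with respect to (alpha, beta).

    alpha: (K,) intercepts, one per quantile level (assumed parameterization)
    beta:  (p,) slope vector shared across levels
    """
    n = X.shape[0]
    K = alpha.shape[0]
    taus = np.arange(1, K + 1) / (K + 1)  # assumed equally spaced levels
    # resid[i, k] = y_i - alpha_k - x_i' beta
    resid = (y - X @ beta)[:, None] - alpha[None, :]
    loss = smoothed_check_loss(resid, taus[None, :], h).mean()
    psi = smoothed_check_grad(resid, taus[None, :], h)  # shape (n, K)
    grad_alpha = -psi.sum(axis=0) / (n * K)
    grad_beta = -X.T @ psi.sum(axis=1) / (n * K)
    return loss, grad_alpha, grad_beta
```

Under this kernel choice the smoothed loss is differentiable everywhere and differs from the check loss by at most $h \log 2$, so a smaller bandwidth tracks the original loss more closely at the cost of larger curvature. A penalized solver along the lines described in the abstract could use the returned gradient inside a proximal or majorize-minimization step.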