In this paper, we study distributed estimation and support recovery for high-dimensional linear quantile regression. Quantile regression is a popular alternative to least squares regression because it is robust to outliers and accommodates data heterogeneity. However, the non-smoothness of the check loss function poses substantial challenges to both computation and theory in the distributed setting. To tackle these problems, we transform the original quantile regression problem into a least-squares optimization problem. By applying a double-smoothing approach, we extend a previous Newton-type distributed approach and remove its restrictive assumption that the error term is independent of the covariates. We develop an algorithm that is efficient in both computation and communication. Theoretically, the proposed distributed estimator achieves a near-oracle convergence rate and high support recovery accuracy after a constant number of iterations. Extensive experiments on synthetic examples and a real data application further demonstrate the effectiveness of the proposed method.
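
To make the non-smoothness concrete, the sketch below contrasts the check loss with a smoothed surrogate. The abstract does not specify how the smoothing is carried out, so the Gaussian-kernel convolution used here, the bandwidth `h`, and all function names are illustrative assumptions rather than the paper's double-smoothing construction.

```python
import math


def check_loss(u, tau):
    """Check (pinball) loss rho_tau(u) = u * (tau - 1{u < 0}); it has a kink at u = 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def _norm_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def smoothed_check_loss(u, tau, h):
    """Check loss convolved with a Gaussian kernel of bandwidth h (assumed smoothing).

    Uses rho_tau(u) = |u|/2 + (tau - 1/2) * u and the closed form of E|a + Z|
    for a standard normal Z.
    """
    a = u / h
    g = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * a * a) + a * (1.0 - 2.0 * _norm_cdf(-a))
    return 0.5 * h * g + (tau - 0.5) * u


def smoothed_check_grad(u, tau, h):
    """Derivative of the smoothed loss in u; unlike the check loss, it is continuous at 0."""
    return tau - _norm_cdf(-u / h)


if __name__ == "__main__":
    tau, h = 0.3, 0.5
    for u in (-1.0, 0.0, 1.0):
        print(u, check_loss(u, tau), smoothed_check_loss(u, tau, h), smoothed_check_grad(u, tau, h))
```

As `h` shrinks, the surrogate approaches the check loss, while for any fixed `h > 0` it is twice differentiable, which is the property that Newton-type updates rely on.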