Screening and working set techniques are important approaches to reducing the size of an optimization problem. They have been widely used to accelerate first-order methods for solving large-scale sparse learning problems. In this paper, we develop a new screening method called Newton screening (NS), which is a generalized Newton method with a built-in screening mechanism. We derive an equivalent KKT system for the Lasso and use a generalized Newton method to solve the KKT equations. Based on this KKT system, a built-in working set of relatively small size is first determined using the sum of the primal and dual variables generated in the previous iteration; the primal variable is then updated by solving a least-squares problem on the working set, and the dual variable is updated using a closed-form expression. Moreover, we consider a sequential version of Newton screening (SNS) with a warm-start strategy. We show that NS possesses an optimal convergence property in the sense that it achieves one-step local convergence. Under certain regularity conditions on the feature matrix, we show that, with high probability, SNS reaches a solution with the same signs as the underlying true target and achieves a sharp estimation error bound. Simulation studies and real data analysis support our theoretical results and demonstrate that SNS is faster and more accurate than several state-of-the-art methods.
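The iteration described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Lasso objective is scaled as (1/(2n))||y - Xβ||² + λ||β||₁, that the dual variable is d = Xᵀ(y - Xβ)/n, and that the working set is {j : |β_j + d_j| > λ}; the function name `newton_screening_sketch`, its parameters, and the stopping rule (stop when the signed working set repeats) are hypothetical choices made for illustration.

```python
import numpy as np


def newton_screening_sketch(X, y, lam, beta0=None, max_iter=50):
    """Illustrative working-set iteration for the Lasso (assumed scaling 1/(2n))."""
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else np.asarray(beta0, dtype=float).copy()
    d = X.T @ (y - X @ beta) / n            # dual variable (assumed definition)
    active = np.array([], dtype=int)
    prev_signed = None

    for _ in range(max_iter):
        z = beta + d                         # sum of primal and dual variables
        signed = np.sign(z) * (np.abs(z) > lam)
        # Assumed stopping rule: the signed working set did not change.
        if prev_signed is not None and np.array_equal(signed, prev_signed):
            break
        prev_signed = signed

        active = np.flatnonzero(signed)      # working set
        beta = np.zeros(p)                   # coordinates off the working set are zero
        if active.size > 0:
            XA = X[:, active]
            s = signed[active]
            # Least-squares step on the working set:
            # (XA^T XA) beta_A = XA^T y - n * lam * s
            rhs = XA.T @ y - n * lam * s
            beta[active] = np.linalg.lstsq(XA.T @ XA, rhs, rcond=None)[0]

        # Closed-form dual update from the new primal variable.
        d = X.T @ (y - X @ beta) / n

    return beta, d, active
```

Under these assumptions, a warm start in the spirit of SNS would pass the solution for one value of `lam` as `beta0` when solving for the next, smaller value. The sketch does not guarantee convergence from an arbitrary starting point.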