High-dimensional classification problems often rely on Lasso-penalized linear support vector machines (SVMs). However, the double non-smoothness induced by the hinge loss and the Lasso penalty makes statistical inference challenging and impedes computational efficiency. In this paper, we propose a unified inference framework for both offline and online settings. In the offline case, we apply a convolution smoothing technique to the hinge loss and construct a debiased estimator that eliminates the shrinkage bias, which yields valid confidence intervals. For online streaming data, we develop a real-time estimation and inference procedure that relies only on summary statistics of historical data. Theoretically, we establish the asymptotic normality of both the offline and online debiased estimators. Simulation studies and real data applications demonstrate that our methods achieve valid statistical inference and improved computational efficiency.
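To illustrate what convolution smoothing does to the hinge loss, the following is a minimal sketch and not the estimator used in the paper. It assumes a Gaussian kernel and a bandwidth `h`, neither of which is specified above. Under that assumption the smoothed loss has the closed form t·Φ(t/h) + h·φ(t/h) with t = 1 − u, where u is the margin and Φ, φ are the standard normal CDF and density. The function names are hypothetical.

```python
import math


def hinge(u):
    """Standard hinge loss at margin u = y * x'beta."""
    return max(1.0 - u, 0.0)


def _normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian kernel of bandwidth h > 0.

    Assumption: Gaussian kernel. Equals E[max(1 - u + h*Z, 0)] for
    Z ~ N(0, 1), which has the closed form below.
    """
    t = 1.0 - u
    return t * _normal_cdf(t / h) + h * _normal_pdf(t / h)


def smoothed_hinge_grad(u, h):
    """Derivative of the smoothed loss in u; continuous everywhere,
    unlike the hinge subgradient, which jumps at u = 1."""
    return -_normal_cdf((1.0 - u) / h)
```

The smoothed loss is differentiable at the kink u = 1 and approaches the hinge loss as `h` shrinks, which is what makes gradient-based optimization and a debiasing correction tractable under this kind of smoothing.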