The sensitivity of machine learning algorithms to outliers, particularly in high-dimensional spaces, necessitates the development of robust methods. Within the framework of the $\epsilon$-contamination model, where an adversary can inspect and replace up to an $\epsilon$ fraction of the samples, a fundamental open question is determining the optimal rates for robust stochastic convex optimization (robust SCO) given $\epsilon$-contaminated samples. We develop novel algorithms that achieve minimax-optimal excess risk (up to logarithmic factors) under the $\epsilon$-contamination model. Our approach advances beyond existing algorithms, which are not only suboptimal but also constrained by stringent assumptions, including Lipschitzness and smoothness conditions on the sample functions. Our algorithms achieve optimal rates while removing these restrictive assumptions and, notably, remain effective for nonsmooth but Lipschitz population risks.