State-of-the-art machine learning models can be vulnerable to very small, adversarially constructed input perturbations. Adversarial training is an effective approach to defend against them. Formulated as a min-max problem, it searches for the best solution when the training data is corrupted by worst-case attacks. Linear models are among the simple models in which such vulnerabilities can be observed, and they are the focus of our study. In this case, adversarial training leads to a convex optimization problem that can be formulated as the minimization of a finite sum. We provide a comparative analysis between the solution of adversarial training in linear regression and other regularization methods. Our main findings are: (A) Adversarial training yields the minimum-norm interpolating solution in the overparameterized regime (more parameters than data), as long as the maximum disturbance radius is smaller than a threshold. Conversely, the minimum-norm interpolator is the solution to adversarial training with a given radius. (B) Adversarial training can be equivalent to parameter-shrinking methods (ridge regression and Lasso). This happens in the underparameterized regime, for an appropriate choice of adversarial radius and zero-mean, symmetrically distributed covariates. (C) For $\ell_\infty$-adversarial training, as in square-root Lasso, the choice of adversarial radius for optimal bounds does not depend on the additive noise variance. We confirm our theoretical findings with numerical examples.
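To make the finite-sum formulation concrete, the sketch below relies on the standard dual-norm identity for linear predictors: the worst-case squared error over perturbations with $\|\Delta x\|_p \le \delta$ equals $(|y - x^\top \beta| + \delta \|\beta\|_q)^2$, where $q$ is the dual of $p$. The abstract does not state this identity, so it should be read as an assumption about the underlying formulation. The function names (`adversarial_loss`, `worst_case_attack_linf`) and the synthetic data are hypothetical and used only for illustration.

```python
import numpy as np


def adversarial_loss(beta, X, y, delta, q):
    """Mean worst-case squared error of a linear model.

    q is the dual of the attack norm p (q=1 for an l_inf attack,
    q=2 for an l_2 attack). Assumes the dual-norm closed form.
    """
    residual = np.abs(y - X @ beta)
    return np.mean((residual + delta * np.linalg.norm(beta, ord=q)) ** 2)


def worst_case_attack_linf(beta, x, y, delta):
    """l_inf perturbation of one sample that attains the inner maximum."""
    r = y - x @ beta
    s = 1.0 if r >= 0 else -1.0
    # Push every coordinate by delta in the direction that enlarges |r|.
    return -delta * s * np.sign(beta)


# Synthetic data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
n, d, delta = 20, 5, 0.1
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.5 * rng.normal(size=n)
beta_hat = rng.normal(size=d)  # arbitrary candidate parameter vector

# Closed-form objective versus the loss under the explicit attack.
closed_form = adversarial_loss(beta_hat, X, y, delta, q=1)
attacked = np.mean([
    (y[i] - (X[i] + worst_case_attack_linf(beta_hat, X[i], y[i], delta)) @ beta_hat) ** 2
    for i in range(n)
])

# Loss under a random admissible perturbation, for comparison.
random_dx = rng.uniform(-delta, delta, size=(n, d))
random_loss = np.mean((y - np.sum((X + random_dx) * beta_hat, axis=1)) ** 2)

print(np.isclose(closed_form, attacked), random_loss <= closed_form)
```

Because the closed-form objective is a sum of convex functions of `beta`, it could be passed to a generic convex solver; with `q=1` the penalty term has the same flavor as the Lasso-type methods the abstract compares against.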