We first elucidate various fundamental properties of optimal adversarial predictors: the structure of optimal adversarial convex predictors in terms of optimal adversarial zero-one predictors, bounds relating the adversarial convex loss to the adversarial zero-one loss, and the fact that continuous predictors can get arbitrarily close to the optimal adversarial error for both convex and zero-one losses. Applying these results along with new Rademacher complexity bounds for adversarial training near initialization, we prove that for general data distributions and perturbation sets, adversarial training on shallow networks with early stopping and an idealized optimal adversary can achieve optimal adversarial test error. By contrast, prior theoretical work either considered specialized data distributions or provided only training error guarantees.