We show that label noise exists in adversarial training. This label noise arises from a mismatch between the true label distribution of adversarial examples and the labels inherited from clean examples: the adversarial perturbation distorts the true label distribution, yet the common practice of inheriting labels from clean examples neglects this distortion. Recognizing label noise sheds light on the prevalence of robust overfitting in adversarial training and explains its intriguing dependence on perturbation radius and data quality. Our label noise perspective also aligns well with our observations of epoch-wise double descent in adversarial training. Guided by our analyses, we propose a method that automatically calibrates the labels to address both label noise and robust overfitting. Our method achieves consistent performance improvements across various models and datasets without introducing new hyperparameters or additional tuning.
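To make the mismatch between inherited labels and the true label distribution concrete, the following is a minimal toy sketch, not the analysis or the calibration method of this work. It assumes a hypothetical one-dimensional, two-class problem with equally likely Gaussian classes, so that the true label distribution is known in closed form, and it assumes a simplified attack that shifts a clean point toward the Bayes decision boundary by the perturbation radius `eps`. The names `true_posterior`, `perturb_toward_boundary`, and `inherited_label_mismatch` are illustrative only.

```python
import math


def true_posterior(x, mu=1.0, sigma=1.0):
    """P(y = 1 | x) for two equally likely classes N(+mu, sigma^2) and N(-mu, sigma^2).

    Assumption: this toy data model is chosen only because its true label
    distribution has a closed form (a logistic function of x).
    """
    return 1.0 / (1.0 + math.exp(-2.0 * mu * x / sigma ** 2))


def perturb_toward_boundary(x, eps):
    """Toy attack (assumption): shift x by eps toward the Bayes boundary at 0."""
    return x - eps * math.copysign(1.0, x)


def inherited_label_mismatch(x, y, eps):
    """Mass that the true label distribution at the adversarial point
    assigns to the class other than the inherited clean label y."""
    x_adv = perturb_toward_boundary(x, eps)
    p_class_1 = true_posterior(x_adv)
    return 1.0 - p_class_1 if y == 1 else p_class_1


# A clean example from class 1, attacked with increasing perturbation radii.
clean_x, clean_y = 2.0, 1
radii = (0.0, 0.5, 1.0, 1.5)
mismatches = [inherited_label_mismatch(clean_x, clean_y, eps) for eps in radii]

for eps, m in zip(radii, mismatches):
    print(f"eps={eps:.1f}  mass off the inherited label={m:.3f}")
```

In this toy setting, the probability mass that falls outside the inherited label grows as `eps` grows, which is consistent with the dependence on perturbation radius noted above. Real attacks target a trained model rather than the true distribution, so the sketch illustrates only the direction of the effect.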