Rethinking Adversarial Training with A Simple Baseline

We report competitive results on RobustBench for CIFAR and SVHN using a simple yet effective baseline approach. Our training protocol integrates a rescaled square loss, a cyclic learning rate, and erasing-based data augmentation. The results are comparable to those of models trained with state-of-the-art techniques, which are currently the predominant choice for adversarial training. Our baseline, referred to as SimpleAT, yields three novel empirical insights. (i) Simply switching to square loss yields accuracy comparable to that of the de facto training protocol combined with data augmentation. (ii) A one-cycle learning rate is a good scheduler that effectively reduces the risk of robust overfitting. (iii) Training with a rescaled square loss yields a favorable balance between adversarial and natural accuracy. Overall, our experimental results show that SimpleAT effectively mitigates robust overfitting and consistently achieves the best performance at the end of training. For example, on CIFAR-10 with ResNet-18, SimpleAT achieves approximately 52% adversarial accuracy against the strong AutoAttack. Furthermore, SimpleAT is robust to various image corruptions, including those in the CIFAR-10-C dataset. Finally, we assess the effectiveness of these insights through two techniques: bias-variance analysis and logit penalty methods. Our findings demonstrate that all of these simple techniques reduce the variance of model predictions, which is regarded as the primary contributor to robust overfitting. Our analysis also uncovers connections with various state-of-the-art methods.
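The abstract names a rescaled square loss and a one-cycle learning rate but does not define either. The sketch below illustrates one plausible reading and is not the authors' implementation. It assumes the rescaled square loss pulls the true-class logit toward a target `M` with weight `k` and every other logit toward 0, and that the schedule rises linearly to a peak and then decays linearly to zero. The names `rescaled_square_loss`, `one_cycle_lr`, `k`, `M`, and `peak_step` are hypothetical.

```python
def rescaled_square_loss(logits, label, k=1.0, M=1.0):
    """Square loss between logits and a rescaled one-hot target.

    Assumed form (not taken from the abstract): the true-class logit is
    pulled toward M with weight k; all other logits are pulled toward 0.
    With k = M = 1 this reduces to the plain square loss on one-hot labels.
    """
    total = 0.0
    for i, z in enumerate(logits):
        if i == label:
            total += k * (z - M) ** 2
        else:
            total += z ** 2
    # Average over the number of classes.
    return total / len(logits)


def one_cycle_lr(step, total_steps, peak_step, max_lr=0.1):
    """Piecewise-linear one-cycle schedule: 0 -> max_lr -> 0.

    Assumes 0 < peak_step < total_steps and 0 <= step <= total_steps.
    """
    if step <= peak_step:
        # Linear warm-up phase.
        return max_lr * step / peak_step
    # Linear decay phase.
    return max_lr * (total_steps - step) / (total_steps - peak_step)
```

With `k = M = 1`, a logit vector that exactly matches the one-hot label has zero loss; larger `k` and `M` weight the true class more heavily, which is one way such a loss could trade off adversarial against natural accuracy.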
