Although adversarial training has become the de facto method for improving the robustness of deep neural networks, vanilla adversarial training is well known to suffer from severe robust overfitting, which results in unsatisfactory robust generalization. Over the last few years, a number of approaches have been proposed to address this drawback, such as extra regularization, adversarial weight perturbation, and training with more data. However, the improvement in robust generalization is still far from satisfactory. In this paper, we approach this challenge from a new perspective: refining historical optimization trajectories. We propose a new method named \textbf{Weighted Optimization Trajectories (WOT)} that leverages the optimization trajectories of adversarial training over time. We conduct extensive experiments to demonstrate the effectiveness of WOT under various state-of-the-art adversarial attacks. Our results show that WOT integrates seamlessly with existing adversarial training methods and consistently overcomes the robust overfitting issue, resulting in better adversarial robustness. For example, WOT boosts the robust accuracy of AT-PGD under the AA-$L_{\infty}$ attack by 1.53\% $\sim$ 6.11\% while increasing the clean accuracy by 0.55\% $\sim$ 5.47\% across the SVHN, CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets.