Deep neural networks are susceptible to adversarial attacks, which can compromise their performance and accuracy. Adversarial Training (AT) has emerged as a popular approach for protecting neural networks against such attacks. However, a key challenge of AT is robust overfitting, where the network's robust performance on test data deteriorates with further training, thus hindering generalization. Motivated by the concept of active forgetting in the brain, we introduce a novel learning paradigm called "Forget to Mitigate Overfitting (FOMO)". FOMO alternates between a forgetting phase, which randomly forgets a subset of weights and regulates the model's information through weight reinitialization, and a relearning phase, which emphasizes learning generalizable features. Our experiments on benchmark datasets and adversarial attacks show that FOMO alleviates robust overfitting by significantly reducing the gap between the best and last robust test accuracy while improving on state-of-the-art robustness. Furthermore, FOMO provides a better trade-off between standard and robust accuracy, outperforming baseline adversarial methods. Finally, our framework is robust to AutoAttack and improves generalization in many real-world scenarios.
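The alternation between forgetting and relearning described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fraction of weights forgotten, the reinitialization distribution, or the schedule, so `forget_fraction`, `init_std`, `num_cycles`, and the `relearn_fn` callback (a stand-in for, say, a few epochs of adversarial training) are all hypothetical names and assumptions, not the authors' implementation.

```python
import numpy as np


def forget_step(weights, forget_fraction, rng, init_std=0.01):
    """Reinitialize a random subset of entries in each weight array.

    Assumption: forgotten weights are redrawn from N(0, init_std^2);
    the actual reinitialization scheme is not given in the abstract.
    """
    new_weights = {}
    for name, w in weights.items():
        # True marks a weight that is "forgotten" in this step.
        mask = rng.random(w.shape) < forget_fraction
        fresh = rng.normal(0.0, init_std, size=w.shape)
        # Keep the old value where mask is False, use the fresh draw otherwise.
        new_weights[name] = np.where(mask, fresh, w)
    return new_weights


def train_with_forgetting(weights, relearn_fn, num_cycles, forget_fraction, seed=0):
    """Alternate a forgetting step with a relearning step.

    relearn_fn is a hypothetical callback that takes the weight dict and
    returns updated weights (for example, after adversarial training).
    """
    rng = np.random.default_rng(seed)
    for _ in range(num_cycles):
        weights = forget_step(weights, forget_fraction, rng)
        weights = relearn_fn(weights)
    return weights
```

In this sketch, setting `forget_fraction` to 0 reduces the loop to plain repeated relearning, which makes it easy to compare against a baseline without forgetting.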