Adversarial training has shown promise in building models that are robust to adversarial examples. Its major drawback is the computational overhead of generating those examples. To reduce this cost, adversarial training based on single-step attacks has been explored. Previous work improves single-step adversarial training from several perspectives, e.g., sample initialization, loss regularization, and training strategy, but almost all of it treats the underlying model as a black box. In this work, we propose to exploit the model's interior building blocks to improve efficiency. Specifically, we dynamically sample lightweight subnetworks to serve as a surrogate model during training, which accelerates both the forward and backward passes. We also provide a theoretical analysis showing that single-step adversarial training with sampled subnetworks can improve model robustness. Furthermore, we propose a novel sampling strategy in which the sampling varies from layer to layer and from iteration to iteration. Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness. Evaluations on a series of popular datasets demonstrate the effectiveness of the proposed FP-Better. Our code has been released at https://github.com/jiaxiaojunQAQ/FP-Better.
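To make the idea of layer-wise, per-iteration subnetwork sampling concrete, the following is a minimal sketch and not the released implementation. It assumes residual-style blocks, so that a skipped block reduces to the identity shortcut. The names `sample_subnetwork`, `forward`, and `keep_probs`, as well as the probability values, are hypothetical and chosen only for illustration; the abstract does not specify how the sampling probabilities are set.

```python
import random


def sample_subnetwork(keep_probs, rng):
    """Draw one keep/skip decision per layer.

    Each layer gets an independent draw with its own probability, so the
    decision can differ from layer to layer. Calling this once per training
    step makes the sampled subnetwork differ from iteration to iteration.
    """
    return [rng.random() < p for p in keep_probs]


def forward(x, blocks, mask):
    """Residual-style pass over the sampled subnetwork.

    Assumption: each block sits on a residual branch, so a skipped block
    contributes nothing and the input passes through unchanged.
    """
    for block, keep in zip(blocks, mask):
        if keep:
            x = x + block(x)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy scalar "blocks" standing in for real network layers.
    blocks = [lambda v: 0.5 * v, lambda v: -0.1 * v, lambda v: 0.2 * v]
    keep_probs = [0.9, 0.5, 0.7]  # hypothetical per-layer keep probabilities
    for step in range(3):
        mask = sample_subnetwork(keep_probs, rng)  # resampled every iteration
        print(step, mask, forward(1.0, blocks, mask))
```

In an actual training loop, the sampled subnetwork would presumably be used for the forward and backward passes of the single-step attack, which is where skipping blocks would save computation.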