Deep neural networks are susceptible to human-imperceptible adversarial perturbations. One of the strongest defense mechanisms is \emph{Adversarial Training} (AT). In this paper, we address two predominant problems in AT. First, there is still little consensus on how to set hyperparameters with a performance guarantee, and customized settings impede a fair comparison between different model designs. Second, robustly trained neural networks struggle to generalize and suffer from severe overfitting. This paper focuses on the primary AT framework, Projected Gradient Descent Adversarial Training (PGD-AT). We approximate the dynamics of PGD-AT by a continuous-time Stochastic Differential Equation (SDE) and show that the diffusion term of this SDE determines robust generalization. An immediate implication of this theoretical finding is that robust generalization is positively correlated with the ratio of learning rate to batch size. We further propose a novel approach, \emph{Diffusion Enhanced Adversarial Training} (DEAT), which manipulates the diffusion term to improve robust generalization with virtually no extra computational burden. We theoretically show that DEAT obtains a tighter generalization bound than PGD-AT. Our extensive empirical investigation confirms that DEAT universally outperforms PGD-AT by a significant margin.
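To see why the ratio of learning rate to batch size enters the diffusion term, the following is a minimal sketch of the standard heuristic for approximating mini-batch SGD by an SDE. It is not necessarily the exact SDE derived for PGD-AT. All symbols are assumptions introduced for illustration: $\theta$ denotes the parameters, $L$ the adversarial training loss, $\eta$ the learning rate, $B$ the batch size, $\Sigma(\theta)$ the per-sample gradient covariance, and $W_t$ a standard Wiener process. The sketch further assumes the batch is much smaller than the dataset, so that the mini-batch gradient noise has covariance $\Sigma(\theta)/B$.

```latex
% Illustrative sketch only: generic SGD-to-SDE heuristic, with assumed notation.
% One mini-batch step, with the gradient noise written explicitly:
\begin{equation}
  \theta_{k+1} = \theta_k - \eta \nabla L(\theta_k)
                 - \frac{\eta}{\sqrt{B}}\, \xi_k ,
  \qquad \mathbb{E}[\xi_k] = 0, \quad \mathrm{Cov}(\xi_k) = \Sigma(\theta_k).
\end{equation}
% Reading this as an Euler--Maruyama step of size \Delta t = \eta gives the SDE
\begin{equation}
  d\theta_t = -\nabla L(\theta_t)\, dt
              + \sqrt{\frac{\eta}{B}}\; \Sigma(\theta_t)^{1/2}\, dW_t ,
\end{equation}
% because the discretized noise term is
% \sqrt{\eta/B}\,\Sigma^{1/2}\sqrt{\Delta t}\,Z = (\eta/\sqrt{B})\,\Sigma^{1/2} Z
% with Z \sim \mathcal{N}(0, I), matching the noise scale of the update above.
```

Under these assumptions the diffusion coefficient scales as $\sqrt{\eta/B}$, so increasing the learning rate or decreasing the batch size strengthens the diffusion term. This is the quantity the abstract ties to robust generalization.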