Deep neural networks are susceptible to human-imperceptible adversarial perturbations. One of the strongest defense mechanisms is \emph{Adversarial Training} (AT). In this paper, we address two predominant problems in AT. First, there is still little consensus on how to set hyperparameters with a performance guarantee, and customized settings impede a fair comparison between different model designs. Second, robustly trained neural networks struggle to generalize and suffer from severe overfitting. This paper focuses on the primary AT framework, Projected Gradient Descent Adversarial Training (PGD-AT). We approximate the dynamics of PGD-AT by a continuous-time Stochastic Differential Equation (SDE) and show that the diffusion term of this SDE determines robust generalization. An immediate implication of this theoretical finding is that robust generalization is positively correlated with the ratio between learning rate and batch size. We further propose a novel approach, \emph{Diffusion Enhanced Adversarial Training} (DEAT), which manipulates the diffusion term to improve robust generalization with virtually no extra computational burden. We theoretically show that DEAT obtains a tighter generalization bound than PGD-AT. Our extensive empirical investigation confirms that DEAT consistently outperforms PGD-AT by a significant margin.
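
For orientation, a common way to write the SDE approximation of minibatch stochastic gradient training is sketched below. This is the generic form used in the SGD-as-SDE literature, not necessarily the exact equation derived for PGD-AT in this paper. The symbols $\theta_t$, $L$, $\eta$, $B$, $\Sigma$, and $W_t$ are assumptions introduced here for illustration only.

\begin{equation}
  d\theta_t = -\nabla_\theta L(\theta_t)\,dt + \sqrt{\frac{\eta}{B}}\;\Sigma(\theta_t)^{1/2}\,dW_t
\end{equation}

Here $\theta_t$ denotes the model parameters, $L$ the training loss (in the AT setting, presumably the adversarial loss), $\eta$ the learning rate, $B$ the batch size, $\Sigma$ the covariance of the per-example gradient noise, and $W_t$ a standard Wiener process. In this generic form the diffusion term scales with $\sqrt{\eta/B}$, which is the kind of dependence on the ratio between learning rate and batch size that the abstract refers to.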