Similar to their surprising performance in standard deep learning, deep nets trained by adversarial training also generalize well to $\textit{unseen clean data (natural data)}$. However, although adversarial training can achieve low robust training error, a significant $\textit{robust generalization gap}$ remains. We call this phenomenon $\textit{Clean Generalization and Robust Overfitting (CGRO)}$. In this work, we study the CGRO phenomenon in adversarial training from two views: $\textit{representation complexity}$ and $\textit{training dynamics}$. Specifically, we consider a binary classification setting with $N$ separated training data points. $\textit{First}$, we prove that, assuming there exists a $\operatorname{poly}(D)$-size clean classifier (where $D$ is the data dimension), a ReLU net with only $O(N D)$ extra parameters can leverage robust memorization to achieve CGRO, whereas a robust classifier still requires exponential representation complexity in the worst case. $\textit{Next}$, we focus on a structured-data case to analyze the training dynamics, where we train a two-layer convolutional network of width $O(N D)$ against adversarial perturbations. We show that a three-stage phase transition occurs during the learning process and that the network provably converges to a robust memorization regime, which in turn results in CGRO. $\textit{Finally}$, we empirically verify our theoretical analysis with experiments on real-image recognition datasets.
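The three quantities behind CGRO can be made concrete with the following sketch. The notation is an assumption introduced here for illustration and is not taken from the text above: $f$ denotes the trained network, $\mathcal{S} = \{(x_i, y_i)\}_{i=1}^{N}$ the training set with labels $y_i \in \{-1, +1\}$, $\mathcal{D}$ the data distribution, and $\delta$ the perturbation radius in some $\ell_p$ norm.

$$
\begin{aligned}
\widehat{\mathcal{L}}_{\mathrm{rob}}(f) &= \frac{1}{N} \sum_{i=1}^{N} \max_{\|\xi\|_p \le \delta} \mathbb{1}\big[\operatorname{sgn} f(x_i + \xi) \ne y_i\big] && \text{(robust training error)} \\
\mathcal{L}_{\mathrm{clean}}(f) &= \mathbb{E}_{(x, y) \sim \mathcal{D}}\, \mathbb{1}\big[\operatorname{sgn} f(x) \ne y\big] && \text{(clean test error)} \\
\mathcal{L}_{\mathrm{rob}}(f) &= \mathbb{E}_{(x, y) \sim \mathcal{D}} \max_{\|\xi\|_p \le \delta} \mathbb{1}\big[\operatorname{sgn} f(x + \xi) \ne y\big] && \text{(robust test error)}
\end{aligned}
$$

Under this assumed notation, CGRO corresponds to $\widehat{\mathcal{L}}_{\mathrm{rob}}(f)$ and $\mathcal{L}_{\mathrm{clean}}(f)$ both being small while $\mathcal{L}_{\mathrm{rob}}(f)$ stays large, so that the robust generalization gap $\mathcal{L}_{\mathrm{rob}}(f) - \widehat{\mathcal{L}}_{\mathrm{rob}}(f)$ is significant.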