Training Deep Neural Networks (DNNs) with adversarial examples often results in poor generalization to test-time adversarial data. This paper investigates this issue, known as adversarially robust generalization, through the lens of Rademacher complexity. Building upon the studies of Khim and Loh (2018) and Yin et al. (2019), numerous works have been devoted to this problem, yet a satisfactory bound remains elusive. Existing works on DNNs either apply to a surrogate loss rather than the robust loss itself, or yield bounds that are notably looser than their standard counterparts. In the latter case, the bounds depend more heavily on the width $m$ of the DNNs or the dimension $d$ of the data, incurring an extra factor of at least $\mathcal{O}(\sqrt{m})$ or $\mathcal{O}(\sqrt{d})$. This paper presents upper bounds for the adversarial Rademacher complexity of DNNs that match the best-known upper bounds in standard settings, as established by Bartlett et al. (2017), with the dependency on width and dimension being $\mathcal{O}(\ln(dm))$. The central challenge addressed is calculating the covering number of adversarial function classes. We aim to construct a new cover that possesses two properties: 1) compatibility with adversarial examples, and 2) precision comparable to that of covers used in standard settings. To this end, we introduce a new variant of the covering number, called the \emph{uniform covering number}, specifically designed and proven to reconcile these two properties. Consequently, our method effectively bridges the gap between the Rademacher complexity of robust and standard generalization.
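For concreteness, the quantity studied here is the empirical Rademacher complexity of the adversarial loss class; the following is a minimal sketch in standard notation (the sample $S=\{(x_i,y_i)\}_{i=1}^{n}$, perturbation radius $\epsilon$, norm $\|\cdot\|_p$, and loss $\ell$ are illustrative conventions in the spirit of Yin et al. (2019), not definitions taken from this abstract):
\[
\mathcal{R}_S\big(\tilde{\ell}\circ\mathcal{F}\big)
\;=\;
\mathbb{E}_{\sigma}\left[\,\sup_{f\in\mathcal{F}}\,\frac{1}{n}\sum_{i=1}^{n}\sigma_i
\max_{\|x_i'-x_i\|_p\le\epsilon}\ell\big(f(x_i'),y_i\big)\right],
\]
where $\sigma_1,\dots,\sigma_n$ are i.i.d. Rademacher variables. The inner maximum over perturbations is what makes covering the adversarial function class difficult; once a covering-number bound is available, a Rademacher-complexity bound follows from a Dudley-type entropy integral, e.g. (up to constants and norm normalizations, which vary across references)
\[
\mathcal{R}_S(\mathcal{G})
\;\lesssim\;
\inf_{\alpha>0}\left(\alpha + \frac{1}{\sqrt{n}}\int_{\alpha}^{D}\sqrt{\ln\mathcal{N}\big(\mathcal{G},\varepsilon,\|\cdot\|\big)}\;\mathrm{d}\varepsilon\right),
\]
with $\mathcal{N}$ the covering number and $D$ the diameter of $\mathcal{G}$ under the chosen norm.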