Training Deep Neural Networks (DNNs) with adversarial examples often results in poor generalization to test-time adversarial data. This paper investigates this issue, known as adversarially robust generalization, through the lens of Rademacher complexity. Building upon the studies of Khim and Loh (2018) and Yin et al. (2019), numerous works have been devoted to this problem, yet a satisfactory bound remains elusive. Existing works on DNNs either apply to a surrogate loss rather than the robust loss itself, or yield bounds that are notably looser than their standard counterparts. In the latter case, the bounds depend more heavily on the width $m$ of the DNNs or the dimension $d$ of the data, incurring an extra factor of at least $\mathcal{O}(\sqrt{m})$ or $\mathcal{O}(\sqrt{d})$. This paper presents upper bounds for the adversarial Rademacher complexity of DNNs that match the best-known upper bounds in standard settings, as established by Bartlett et al. (2017), with the dependency on width and dimension being $\mathcal{O}(\ln(dm))$. The central challenge addressed is calculating the covering number of adversarial function classes. We aim to construct a new cover that possesses two properties: 1) compatibility with adversarial examples, and 2) precision comparable to that of covers used in standard settings. To this end, we introduce a new variant of the covering number, called the \emph{uniform covering number}, specifically designed and proven to reconcile these two properties. Consequently, our method effectively bridges the gap between the Rademacher complexity of robust and standard generalization.
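For concreteness, the quantity studied here is the empirical Rademacher complexity of the adversarial loss class; the following is a minimal sketch in standard notation (the sample $S=\{(x_i,y_i)\}_{i=1}^{n}$, perturbation radius $\epsilon$, norm $\|\cdot\|_p$, and loss $\ell$ are illustrative conventions in the spirit of Yin et al. (2019), not definitions taken from this abstract):
\[
\mathcal{R}_S\big(\tilde{\ell}\circ\mathcal{F}\big)
\;=\;
\mathbb{E}_{\sigma}\left[\,\sup_{f\in\mathcal{F}}\,\frac{1}{n}\sum_{i=1}^{n}\sigma_i
\max_{\|x_i'-x_i\|_p\le\epsilon}\ell\big(f(x_i'),y_i\big)\right],
\]
where $\sigma_1,\dots,\sigma_n$ are i.i.d. Rademacher variables. The inner maximum over perturbations is what makes covering the adversarial function class difficult; once a covering-number bound is available, a Rademacher-complexity bound follows from a Dudley-type entropy integral, e.g. (up to constants and norm normalizations, which vary across references)
\[
\mathcal{R}_S(\mathcal{G})
\;\lesssim\;
\inf_{\alpha>0}\left(\alpha + \frac{1}{\sqrt{n}}\int_{\alpha}^{D}\sqrt{\ln\mathcal{N}\big(\mathcal{G},\varepsilon,\|\cdot\|\big)}\;\mathrm{d}\varepsilon\right),
\]
with $\mathcal{N}$ the covering number and $D$ the diameter of $\mathcal{G}$ under the chosen norm.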