Powerful deep neural networks are vulnerable to adversarial attacks. To obtain adversarially robust models, researchers have separately developed adversarial training and Jacobian regularization techniques. Adversarial training enjoys abundant theoretical and empirical study, but theoretical foundations for Jacobian regularization are still lacking. In this study, we show that Jacobian regularization is closely related to adversarial training: the $\ell_{2}$ or $\ell_{1}$ Jacobian regularized loss serves as an approximate upper bound on the adversarially robust loss under an $\ell_{2}$ or $\ell_{\infty}$ adversarial attack, respectively. Further, we establish the robust generalization gap for the Jacobian regularized risk minimizer by bounding the Rademacher complexity of both the standard loss function class and the Jacobian regularization function class. Our theoretical results indicate that the norms of the Jacobian are related to both standard and robust generalization. We also perform experiments on MNIST data classification to demonstrate that Jacobian regularized risk minimization indeed serves as a surrogate for adversarially robust risk minimization, and that reducing the norms of the Jacobian can improve both standard and robust generalization. This study advances both the theoretical and empirical understanding of adversarially robust generalization via Jacobian regularization.
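To make the upper-bound relationship concrete, the following is a minimal numpy sketch, not the paper's construction: for a toy linear model $f(x) = Wx$, the input-Jacobian is exactly $W$, and the triangle inequality gives $\mathrm{loss}(f(x+\delta), y) \le \mathrm{loss}(f(x), y) + \epsilon \, \|W\|_{2}$ for any $\ell_{2}$ perturbation with $\|\delta\|_{2} \le \epsilon$. All names (`W`, `eps`, the absolute-error loss) are illustrative assumptions.

```python
import numpy as np

# Toy linear "network" f(x) = W x, whose input-Jacobian is exactly W.
# For a loss that is 1-Lipschitz in f's output (here: l2 error), any
# l2 attack of radius eps satisfies
#   loss(f(x + delta), y) <= loss(f(x), y) + eps * ||W||_2,
# i.e. standard loss plus a Jacobian-norm penalty bounds the robust loss.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # model weights; Jacobian of f w.r.t. x is W
x = rng.normal(size=5)
y = W @ x + 0.1               # target chosen near f(x) for illustration

def f(x):
    return W @ x

def loss(pred, target):
    return np.linalg.norm(pred - target)   # 1-Lipschitz in pred

eps = 0.05
# An l2 perturbation along the top right-singular vector of W, scaled to
# radius eps (a strong direction for a linear model; the bound holds for
# ANY delta with ||delta||_2 <= eps).
U, s, Vt = np.linalg.svd(W)
delta = eps * Vt[0]

robust_loss = loss(f(x + delta), y)
jacobian_bound = loss(f(x), y) + eps * s[0]  # s[0] = spectral norm ||W||_2
assert robust_loss <= jacobian_bound + 1e-9
```

For a deep network the Jacobian varies with $x$, so the analogous bound is only a first-order approximation, which is why the abstract describes the regularized loss as an approximate upper bound.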