Deep networks are well known to be vulnerable to adversarial attacks, and adversarial training is one of the most popular methods for training a robust model. To take advantage of unlabeled data, recent works have applied adversarial training to contrastive learning (Adversarial Contrastive Learning; ACL for short) and obtained promising robust performance. However, the theory of ACL is not well understood. To fill this gap, we leverage Rademacher complexity to analyze the generalization performance of ACL, with a particular focus on linear models and multi-layer neural networks under $\ell_p$ attacks ($p \ge 1$). Our theory shows that the average adversarial risk of the downstream tasks can be upper bounded by the adversarial unsupervised risk of the upstream task. The experimental results validate our theory.