Adversarial robustness has recently received considerable attention in the quest for trustworthy artificial intelligence. However, recent work on adversarial robustness has focused on supervised learning, where labeled data is assumed to be plentiful. In this paper, we investigate semi-supervised adversarial training, where labeled data is scarce. We derive two upper bounds for the robust risk and propose a regularization term for unlabeled data motivated by these bounds. We then develop a semi-supervised adversarial training algorithm that combines the proposed regularization term with knowledge distillation using a semi-supervised teacher (i.e., a teacher model trained with a semi-supervised learning algorithm). Our experiments show that the proposed algorithm achieves state-of-the-art performance, outperforming existing algorithms by significant margins. In particular, its performance is not much worse than that of supervised learning algorithms even when the amount of labeled data is very small. For example, on CIFAR-10, our algorithm with only 8% labeled data is comparable to supervised adversarial training algorithms that use all labeled data, in terms of both standard and robust accuracy.