This work studies sparse adversarial perturbations bounded in the $l_0$ norm. We propose sparse-PGD, a white-box PGD-like attack that generates such perturbations effectively and efficiently. We further combine sparse-PGD with a black-box attack to evaluate models' robustness against $l_0$-bounded adversarial perturbations more comprehensively and reliably. The efficiency of sparse-PGD also makes adversarial training practical, which lets us build models that are robust to sparse perturbations. Extensive experiments demonstrate that the proposed attack performs strongly across different scenarios. More importantly, compared with other robust models, our adversarially trained model demonstrates state-of-the-art robustness against various sparse attacks. Code is available at https://github.com/CityU-MLO/sPGD.
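To make the $l_0$ constraint concrete, the sketch below shows the standard Euclidean projection onto an $l_0$ ball: keep the $k$ largest-magnitude entries of a perturbation and zero out the rest. This is a generic illustration only, not the sparse-PGD algorithm itself (see the repository above for the authors' implementation); the function name `project_l0`, the budget `k`, and the toy vector are assumptions introduced here.

```python
def project_l0(delta, k):
    """Keep the k largest-magnitude entries of delta and zero the rest.

    Hypothetical helper for illustration; delta is a flat list of floats
    and k is the sparsity budget (maximum number of non-zero entries).
    """
    if k <= 0:
        return [0.0 for _ in delta]
    # Indices sorted by absolute value, largest first; keep the first k.
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(delta)]


# Toy perturbation with five entries and a budget of two non-zero entries.
delta = [0.1, -0.9, 0.05, 0.4, -0.3]
sparse = project_l0(delta, k=2)
print(sparse)
```

After the projection, at most `k` entries of the perturbation are non-zero, which is exactly what an $l_0$ bound of `k` requires. A PGD-style attack would typically alternate a gradient step with a projection of this kind, although the details of how sparse-PGD does this are not stated in the abstract.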