In this work, we leverage visual prompting (VP) to improve the adversarial robustness of a fixed, pre-trained model at test time. Compared to conventional adversarial defenses, VP allows us to design universal (i.e., data-agnostic) input prompting templates, which can be applied in a plug-and-play manner at test time to achieve the desired model performance without introducing much computational overhead. Although VP has been successfully applied to improving model generalization, it remains unclear whether and how it can be used to defend against adversarial attacks. We investigate this problem and show that the vanilla VP approach is not effective for adversarial defense, since a universal input prompt lacks the capacity for robust learning against sample-specific adversarial perturbations. To address this limitation, we propose a new VP method, termed Class-wise Adversarial Visual Prompting (C-AVP), which generates class-wise visual prompts so as to not only leverage the strengths of ensemble prompts but also optimize their interrelations to improve model robustness. Our experiments show that C-AVP outperforms the conventional VP method, with a 2.1× gain in standard accuracy and a 2× gain in robust accuracy. Compared to classical test-time defenses, C-AVP also yields a 42× inference-time speedup.
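To make the idea of class-wise prompts concrete, the following is a minimal sketch of how a set of per-class prompts might be used at test time with a frozen model. The abstract does not specify the inference rule, so everything here is an assumption made for illustration: the additive prompt with clipping to [0, 1], the rule that scores class i by the model's i-th output on the input carrying prompt i, and the names `apply_prompt`, `classwise_predict`, `toy_model`, and `toy_prompts`. It does not cover how the prompts are trained.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def apply_prompt(x: Sequence[float], prompt: Sequence[float]) -> Vector:
    """Add a prompt to the input and clip to [0, 1].

    Assumption: the prompt is an additive, input-sized perturbation and
    inputs are normalized pixel values.
    """
    return [min(1.0, max(0.0, xi + pi)) for xi, pi in zip(x, prompt)]


def classwise_predict(
    model: Callable[[Vector], Vector],
    x: Sequence[float],
    prompts: Sequence[Sequence[float]],
) -> int:
    """Pick a class using one prompt per class.

    Assumed rule: class i is scored by the frozen model's i-th output
    when the input carries prompt i; the highest score wins.
    """
    scores = [model(apply_prompt(x, p))[i] for i, p in enumerate(prompts)]
    return max(range(len(scores)), key=scores.__getitem__)


def toy_model(x: Vector) -> Vector:
    """Hypothetical frozen two-class linear model standing in for a real network."""
    weights = [[1.0, -1.0], [-1.0, 1.0]]
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Hypothetical, hand-picked prompts; in practice they would be learned.
toy_prompts = [[0.1, 0.0], [0.0, 0.1]]

if __name__ == "__main__":
    print(classwise_predict(toy_model, [0.8, 0.2], toy_prompts))
    print(classwise_predict(toy_model, [0.2, 0.8], toy_prompts))
```

Under this assumed rule, inference needs one forward pass per class and no per-sample optimization, which is one plausible reading of the plug-and-play, low-overhead property described above.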