In this work, we leverage visual prompting (VP) to improve the adversarial robustness of a fixed, pre-trained model at test time. Compared to conventional adversarial defenses, VP allows us to design universal (i.e., data-agnostic) input prompting templates, which have plug-and-play capabilities at test time and achieve the desired model performance without introducing much computational overhead. Although VP has been successfully applied to improving model generalization, it remains unclear whether and how it can be used to defend against adversarial attacks. We investigate this problem and show that the vanilla VP approach is not effective for adversarial defense, since a universal input prompt lacks the capacity for robust learning against sample-specific adversarial perturbations. To overcome this limitation, we propose a new VP method, termed Class-wise Adversarial Visual Prompting (C-AVP), which generates class-wise visual prompts so as to not only leverage the strengths of ensemble prompts but also optimize their interrelations to improve model robustness. Our experiments show that C-AVP outperforms the conventional VP method, with a 2.1× gain in standard accuracy and a 2× gain in robust accuracy. Compared to classical test-time defenses, C-AVP also yields a 42× inference-time speedup.
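To make the contrast between a universal prompt and class-wise prompts concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Several things here are assumptions for illustration only: the prompt is taken to be additive, the frozen model is replaced by a toy linear classifier (`linear_logits`), and the class-wise selection rule (score class i by its own logit under prompt i, then take the largest score) is one plausible reading of "class-wise prompts" that the abstract does not spell out. All function names are hypothetical, and the prompts are fixed by hand rather than learned.

```python
def linear_logits(weights, x):
    """Toy stand-in for a frozen, pre-trained classifier (assumption).

    weights holds one row per class; the result is one logit per class.
    """
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def apply_prompt(x, prompt):
    """Add a prompt template to an input, element by element.

    The additive form is an assumption; the same template is reused
    for every input, which is what makes it data-agnostic.
    """
    return [v + p for v, p in zip(x, prompt)]


def predict_universal(weights, x, prompt):
    """Vanilla VP sketch: one shared prompt, then an ordinary argmax."""
    logits = linear_logits(weights, apply_prompt(x, prompt))
    return max(range(len(logits)), key=logits.__getitem__)


def predict_classwise(weights, x, prompts):
    """Class-wise sketch: one prompt per class (hypothetical rule).

    Class i is scored by its own logit after prompt i is applied,
    and the class with the highest such score is returned.
    """
    scores = [
        linear_logits(weights, apply_prompt(x, prompt))[i]
        for i, prompt in enumerate(prompts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch the class-wise variant costs one forward pass per class, whereas the universal variant costs a single pass. How the prompts are trained, and how their interrelations are optimized, is outside what the sketch covers.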