Deep models, including Vision Transformers, are known to be vulnerable to adversarial attacks. Many existing defenses against these attacks, such as adversarial training, rely on full-model fine-tuning to induce robustness. These defenses require storing a copy of the entire model, which can have billions of parameters, for each task. At the same time, parameter-efficient prompt tuning is used to adapt large transformer-based models to downstream tasks without the need to save large copies. In this paper, we examine parameter-efficient prompt tuning of Vision Transformers for downstream tasks through the lens of robustness. We show that previous adversarial defense methods, when applied to the prompt tuning paradigm, suffer from gradient obfuscation and are vulnerable to adaptive attacks. We introduce ADAPT, a novel framework for performing adaptive adversarial training in the prompt tuning paradigm. Our method achieves a robust accuracy of ~40%, competitive with state-of-the-art robustness methods that use full-model fine-tuning, while tuning only ~1% of the parameters.
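To make the "~1% of the parameters" figure concrete, the following sketch counts the weights of a frozen ViT-B/16 backbone and compares them with the parameters trained under prompt tuning. The abstract does not state the configuration used, so everything here is an assumption chosen for illustration: the standard ViT-B/16 dimensions, one set of learnable prompt tokens per layer, a hypothetical prompt length of 50, and a hypothetical 10-class linear head. The function names are likewise illustrative.

```python
def vit_backbone_params(dim=768, depth=12, mlp_dim=3072,
                        patch=16, image=224, channels=3):
    """Count parameters of a standard ViT encoder, excluding the classifier head."""
    tokens = (image // patch) ** 2 + 1             # patch tokens plus [CLS]
    patch_embed = channels * patch * patch * dim + dim
    cls_and_pos = dim + tokens * dim               # [CLS] token + position embeddings
    attention = (dim * 3 * dim + 3 * dim) + (dim * dim + dim)   # qkv + output proj
    mlp = (dim * mlp_dim + mlp_dim) + (mlp_dim * dim + dim)     # two linear layers
    layer_norms = 2 * (2 * dim)                    # two LayerNorms per block
    per_layer = attention + mlp + layer_norms
    final_norm = 2 * dim
    return patch_embed + cls_and_pos + depth * per_layer + final_norm


def prompt_tuning_params(dim=768, depth=12, prompt_len=50, num_classes=10):
    """Count trainable parameters: per-layer prompt tokens plus a linear head.

    prompt_len and num_classes are assumed values, not taken from the paper.
    """
    prompts = depth * prompt_len * dim
    head = dim * num_classes + num_classes
    return prompts + head


frozen = vit_backbone_params()
tuned = prompt_tuning_params()
fraction = tuned / frozen
print(f"frozen backbone: {frozen:,} parameters")
print(f"tuned (prompts + head): {tuned:,} parameters")
print(f"tuned / frozen: {fraction:.2%}")
```

Under these assumptions the tuned parameters amount to roughly half a percent of the backbone, the same order of magnitude as the ~1% quoted above. A longer prompt or a larger head raises the fraction proportionally, and only this small set would need to be stored per task.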