Existing backdoor attacks on Vision Transformers (ViTs) that rely on full fine-tuning overwrite the backbone, which is computationally expensive and degrades performance. This has pushed adversaries towards Visual Parameter-Efficient Fine-Tuning (PEFT), a paradigm dominated by adapter-based (e.g., LoRA) and prompt-based (e.g., VPT) approaches. While adapter security has received initial study, the risks of the growing prompt-based ecosystem remain largely unexplored. We fill this gap, showing how the evolution of VPT towards dynamic, context-aware architectures enables a far more dangerous, emergent threat, even though these dynamic modules also deliver superior benign performance. We propose VIPER, an attack framework built on a lightweight, dynamic Visual Prompt Generator (VPG) that demonstrates this vulnerability. Critically, the dynamic architecture enables Functional Fusion: an emergent phenomenon in which malicious logic and benign task utility are tightly fused into the same sparse, high-magnitude parameter core. This fusion creates a formidable "hostage" dilemma for defenders: pruning the attack necessarily destroys benign performance. Comprehensive evaluations show that VIPER addresses the attacker's trilemma: it achieves state-of-the-art performance on clean data, maintains a near-100% attack success rate (ASR) even under 90% pruning of the VPG module (where LoRA attacks collapse), and adds only an imperceptible 0.06 ms (1.16%) of inference latency. These results, driven by Functional Fusion, expose a new, paradigm-level risk in dynamic prompt architectures.
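To make the pruning claim concrete, the following is a minimal sketch of unstructured magnitude pruning, assuming that is the kind of pruning meant; the abstract does not specify the pruning criterion. The function `magnitude_prune` and the toy weight vector are hypothetical illustrations, not VIPER's code or data.

```python
import random


def magnitude_prune(weights, prune_percent):
    """Zero out the `prune_percent` percent of weights with the smallest magnitude."""
    if not 0 <= prune_percent <= 100:
        raise ValueError("prune_percent must be between 0 and 100")
    n_prune = len(weights) * prune_percent // 100
    # Indices sorted by ascending absolute value; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Hypothetical toy weight vector (an assumption for illustration only):
# 90 near-zero weights followed by a sparse core of 10 high-magnitude weights.
rng = random.Random(0)
small = [rng.gauss(0.0, 0.01) for _ in range(90)]
core = [rng.choice([-1.0, 1.0]) * rng.uniform(1.0, 2.0) for _ in range(10)]
weights = small + core

pruned = magnitude_prune(weights, 90)
survivors = [i for i, w in enumerate(pruned) if w != 0.0]
print(survivors)  # indices of the weights that survive 90% pruning
```

In this toy setting, 90% magnitude pruning removes only the small weights and leaves the high-magnitude core untouched. If both the backdoor and the benign function live in that core, as Functional Fusion describes, a defender can remove the attack only by pruning the same weights that carry benign performance.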