Transferable targeted adversarial attacks aim to mislead models into outputting adversary-specified predictions in black-box scenarios. Recent studies have introduced \textit{single-target} generative attacks that train a separate generator for each target class to produce highly transferable perturbations, which incurs substantial computational overhead as the number of target classes grows. \textit{Multi-target} attacks address this by training a single class-conditional generator covering multiple classes. However, such a generator uses only class labels as conditions and thus fails to exploit the rich semantic information of the target class. To this end, we design a \textbf{C}LIP-guided \textbf{G}enerative \textbf{N}etwork with \textbf{C}ross-attention modules (CGNC) that enhances multi-target attacks by incorporating the textual knowledge of CLIP into the generator. Extensive experiments demonstrate that CGNC yields significant improvements over previous multi-target generative attacks, e.g., a 21.46\% gain in attack success rate when transferring from ResNet-152 to DenseNet-121. Moreover, we propose a masked fine-tuning mechanism that further strengthens our method in the single-class setting, surpassing existing single-target methods.
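To make the conditioning mechanism concrete, the sketch below illustrates (in plain NumPy, not the authors' implementation) how cross-attention can inject CLIP text embeddings into a generator: flattened image-feature tokens act as queries, and the text tokens of a target-class prompt act as keys and values. All names, shapes, and the prompt format are illustrative assumptions.

```python
# Illustrative sketch of CLIP-text cross-attention conditioning.
# NOT the CGNC code: shapes, weights, and token counts are made up.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(img_tokens, txt_tokens, Wq, Wk, Wv):
    """img_tokens: (N, d) generator features; txt_tokens: (M, d) text embeddings.
    Returns (N, d) features in which each image token has attended over the
    target class's text tokens."""
    Q = img_tokens @ Wq                      # queries from generator features
    K = txt_tokens @ Wk                      # keys from CLIP text embeddings
    V = txt_tokens @ Wv                      # values from CLIP text embeddings
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # scaled dot-product similarity
    attn = softmax(scores, axis=-1)          # each row sums to 1 over text tokens
    return attn @ V

rng = np.random.default_rng(0)
d = 8
img = rng.standard_normal((16, d))  # e.g. a 4x4 feature map flattened to 16 tokens
txt = rng.standard_normal((4, d))   # e.g. tokens of "a photo of a <class>" (hypothetical)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = cross_attention(img, txt, Wq, Wk, Wv)
print(out.shape)  # (16, 8)
```

The key design point this sketch captures is that a text prompt carries semantic structure a bare class label lacks, and cross-attention lets every spatial location of the generator consult that structure independently.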