Neural network models are vulnerable to adversarial examples, and adversarial transferability further increases the risk of adversarial attacks. Existing transfer-based methods often rely on substitute models, which can be impractical and costly in real-world scenarios because the training data and the victim model's structural details are unavailable. In this paper, we propose a novel approach that directly constructs adversarial examples by extracting transferable features across various tasks. Our key insight is that adversarial transferability can extend across different tasks. Specifically, we train a sequence-to-sequence generative model, CT-GAT, on adversarial sample data collected from multiple tasks to acquire universal adversarial features and generate adversarial examples for different tasks. We conduct experiments on ten distinct datasets, and the results demonstrate that our method achieves superior attack performance at low cost.
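To make the data side of this idea concrete, the sketch below shows one plausible way to pool (original, adversarial) text pairs from several tasks into a single training set for a sequence-to-sequence model. It is only an illustration under stated assumptions, not the CT-GAT pipeline: the names `Seq2SeqExample` and `pool_adversarial_pairs`, the filtering rules, and the toy data are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Seq2SeqExample:
    """Hypothetical container for one training pair."""
    source: str  # original (clean) text
    target: str  # adversarial rewrite of that text


def pool_adversarial_pairs(
    pairs_by_task: Dict[str, Iterable[Tuple[str, str]]],
) -> List[Seq2SeqExample]:
    """Merge (original, adversarial) pairs from several tasks into one list.

    Assumption: task labels are dropped on purpose, so a model trained on the
    pooled pairs sees perturbation patterns rather than task identity.
    """
    seen = set()
    pooled: List[Seq2SeqExample] = []
    for task in sorted(pairs_by_task):  # sorted for a deterministic order
        for original, adversarial in pairs_by_task[task]:
            original, adversarial = original.strip(), adversarial.strip()
            if not original or original == adversarial:
                continue  # skip empty inputs and pairs with no perturbation
            key = (original, adversarial)
            if key in seen:
                continue  # drop duplicates that appear in more than one task
            seen.add(key)
            pooled.append(Seq2SeqExample(source=original, target=adversarial))
    return pooled


# Toy data: task names and sentences are invented for illustration only.
toy = {
    "sentiment": [
        ("the movie was great", "the movie was grreat"),
        ("bad plot", "bad plot"),  # unchanged, so it is filtered out
    ],
    "spam": [
        ("win a free prize", "w1n a free prize"),
        ("the movie was great", "the movie was grreat"),  # duplicate
    ],
}
examples = pool_adversarial_pairs(toy)
```

Each pooled pair could then be fed to any sequence-to-sequence trainer as a source/target example; the choice of model and training procedure is left open here.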