Adversarial examples, crafted by adding perturbations imperceptible to humans, can deceive neural networks. Recent studies have identified adversarial transferability across models, \textit{i.e.}, the ability of adversarial examples crafted on one model to attack other models. To enhance such transferability, existing input transformation-based methods diversify the input data through transformation augmentations. However, their effectiveness is limited by the finite number of available transformations. In this study, we introduce a novel approach named Learning to Transform (L2T). L2T increases the diversity of transformed images by selecting the optimal combination of operations from a pool of candidates, thereby improving adversarial transferability. We formulate the selection of optimal transformation combinations as a trajectory optimization problem and employ a reinforcement learning strategy to solve it effectively. Comprehensive experiments on the ImageNet dataset, as well as practical tests against Google Vision and GPT-4V, show that L2T surpasses current methods in enhancing adversarial transferability, confirming its effectiveness and practical significance. The code is available at https://github.com/RongyiZhu/L2T.