Transfer-based targeted adversarial attacks against black-box deep neural networks (DNNs) have proven to be significantly more challenging than untargeted ones. The impressive transferability of the current state-of-the-art (SOTA) generative methods comes at the cost of massive amounts of additional data and time-consuming training for each target label. The resulting limited efficiency and flexibility significantly hinder their deployment in practical applications. In this paper, we offer a self-universal perspective that unveils the great yet underexplored potential of input transformations for improving targeted transferability. Specifically, transformations universalize gradient-based attacks by exploiting intrinsic yet overlooked semantics within individual images, exhibiting scalability and results comparable to time-consuming training on massive additional data from diverse classes. We also contribute a surprising empirical insight: one of the most fundamental transformations, simple image scaling, is highly effective, scalable, sufficient, and necessary for enhancing targeted transferability. We further augment simple scaling with orthogonal transformations and block-wise application, resulting in the Simple, faSt, Self-universal yet Strong Scale Transformation (S$^4$ST) for self-universal targeted transferable attacks (TTA). On the ImageNet-Compatible benchmark dataset, our method improves the average targeted transfer success rate against various challenging victim models by 19.8% over existing SOTA transformation methods while requiring only 36% of their attack time. It also outperforms resource-intensive attacks by a large margin in various challenging settings.
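The abstract does not specify the exact S$^4$ST procedure, but the core idea it names, simple image scaling as an input transformation applied before each gradient step, can be sketched minimally. The snippet below is an illustrative assumption, not the paper's implementation: it rescales an image by a random factor (nearest-neighbour, for brevity) and then crops or zero-pads it back to the original resolution, the kind of transform a gradient-based attack would apply to the adversarial example at every iteration.

```python
import numpy as np

def random_scale(img, rng, lo=0.7, hi=1.3):
    """Hypothetical scaling transform: rescale an HxWxC image by a random
    factor, then centre-crop (if enlarged) or zero-pad (if shrunk) back to
    the input size. Scale bounds lo/hi are illustrative, not from the paper."""
    h, w, c = img.shape
    s = rng.uniform(lo, hi)
    nh, nw = max(1, int(h * s)), max(1, int(w * s))
    # nearest-neighbour resize via integer index maps
    rows = np.arange(nh) * h // nh
    cols = np.arange(nw) * w // nw
    scaled = img[rows][:, cols]
    if s >= 1.0:
        # enlarged: take the centre h x w window
        t, l = (nh - h) // 2, (nw - w) // 2
        return scaled[t:t + h, l:l + w]
    # shrunk: paste the small image into a zero canvas of the original size
    out = np.zeros_like(img)
    t, l = (h - nh) // 2, (w - nw) // 2
    out[t:t + nh, l:l + nw] = scaled
    return out
```

In a gradient-based attack loop, one would compute the loss on `random_scale(x_adv, rng)` rather than on `x_adv` directly, so each iteration sees a differently scaled view of the same image.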