The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention because of the security risks of deploying these models in real-world applications. Building on the transferability of adversarial examples, a growing number of transfer-based methods have been developed to fool black-box DNN models whose architectures and parameters are inaccessible. Despite this considerable effort, a standardized benchmark for comparing these methods systematically, fairly, and practically is still lacking. Our investigation shows that the evaluation of some methods needs to be more reasonable and more thorough to verify their effectiveness, avoiding, for example, unfair comparisons and insufficient consideration of possible substitute/victim models. We therefore establish a transfer-based attack benchmark (TA-Bench) that implements 30+ methods. In this paper, we evaluate and compare them comprehensively on 25 popular substitute/victim models on ImageNet. We gain new insights into the effectiveness of these methods and provide guidelines for future evaluations. Code is available at: https://github.com/qizhangli/TA-Bench.