Adversarial attacks can readily disrupt image classification systems, revealing the vulnerability of DNN-based recognition tasks. While existing adversarial perturbations are primarily applied to uncompressed images or to images compressed with traditional codecs such as JPEG, few studies have investigated the robustness of image classification models in the context of DNN-based image compression. With the rapid evolution of image compression techniques, DNN-based learned image compression has emerged as a promising approach for transmitting images in security-critical applications, such as cloud-based face recognition and autonomous driving, owing to its superior rate-distortion performance over traditional codecs. There is therefore a pressing need to thoroughly investigate the robustness of classification systems whose inputs are pre-processed by learned image compression. To bridge this research gap, we explore adversarial attacks on a new pipeline in which learned image compressors serve as pre-processing modules for image classification models. Furthermore, to enhance the transferability of perturbations across different quality levels and architectures of learned image compression models, we introduce a saliency score-based sampling method that enables fast generation of transferable perturbations. Extensive experiments with popular attack methods demonstrate the enhanced transferability of our proposed method when attacking images that have been processed by different learned image compression models.
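To make the attacked pipeline concrete, the sketch below shows a one-step FGSM attack crafted through a compression-then-classification pipeline. This is a minimal illustration, not the paper's saliency score-based method: it assumes the open-source CompressAI codec `bmshj2018_factorized` and a torchvision ResNet-50, and it keeps the codec in train mode so that its additive-noise quantization proxy leaves the whole pipeline differentiable with respect to the input.

```python
# Minimal sketch of attacking a classifier through a learned image
# compressor used as a pre-processing module. Assumptions: CompressAI's
# bmshj2018_factorized codec and a torchvision ResNet-50 stand in for the
# compressor and classifier; FGSM stands in for the attack method.
import torch
import torch.nn.functional as F
from compressai.zoo import bmshj2018_factorized
from torchvision.models import resnet50, ResNet50_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"

# Learned image compressor acting as the pre-processing module.
# Train mode replaces hard rounding with additive uniform noise,
# a common differentiable proxy for quantization.
codec = bmshj2018_factorized(quality=3, pretrained=True).to(device)
codec.train()

# Downstream classifier operating on the reconstructed image.
classifier = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).to(device).eval()
mean = torch.tensor([0.485, 0.456, 0.406], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225], device=device).view(1, 3, 1, 1)

def pipeline_logits(x):
    # Compress-decompress the input, then classify the reconstruction.
    x_hat = codec(x)["x_hat"].clamp(0, 1)
    return classifier((x_hat - mean) / std)

def fgsm(x, label, eps=4 / 255):
    # One-step FGSM: ascend the classification loss through the full pipeline.
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(pipeline_logits(x), label)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

# Usage with a hypothetical image tensor in [0, 1] (spatial size divisible
# by 64, as the codec requires) and a hypothetical true-class index.
x = torch.rand(1, 3, 256, 256, device=device)
label = torch.tensor([207], device=device)
x_adv = fgsm(x, label)
```

Transferability can then be probed by feeding `x_adv` through codecs of other quality levels or architectures before classification, which is the setting the proposed sampling method targets.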