Adversarial attacks can readily disrupt image classification systems, revealing the vulnerability of DNN-based recognition tasks. Existing adversarial perturbations are primarily applied to uncompressed images or to images compressed by traditional methods such as JPEG, and few studies have investigated the robustness of image classification models in the context of DNN-based image compression. With the rapid evolution of image compression, DNN-based learned image compression has emerged as a promising approach for transmitting images in many security-critical applications, such as cloud-based face recognition and autonomous driving, owing to its superior performance over traditional compression. There is therefore a pressing need to thoroughly investigate the robustness of classification systems whose inputs are processed by learned image compression. To bridge this research gap, we explore adversarial attacks on a new pipeline: image classification models that use learned image compressors as pre-processing modules. Furthermore, to enhance the transferability of perturbations across the quality levels and architectures of learned image compression models, we introduce a saliency score-based sampling method that enables fast generation of transferable perturbations. Extensive experiments with popular attack methods demonstrate the enhanced transferability of our proposed method when attacking images that have been compressed with different learned image compression models.