This work introduces adversarial attacks on image classification models based on convolutional neural networks (CNNs). CNNs are deep learning models widely used for image classification. However, powerful pre-trained CNN models that perform very well on image classification datasets can perform poorly when subjected to adversarial attacks. This work explores one well-known adversarial attack, the fast gradient sign method (FGSM), and examines its adverse effect on the performance of image classification models. The FGSM attack is simulated on three pre-trained CNN image classifiers, ResNet-101, AlexNet, and RegNetY 400MF, using randomly chosen images from the ImageNet dataset. The classification accuracy of each model is computed in the absence and in the presence of the attack to demonstrate the attack's detrimental effect on classifier performance. Finally, a defense against the FGSM attack is proposed, based on a modified defensive distillation approach. Extensive results are presented to validate the proposed scheme.
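For readers unfamiliar with FGSM, the core step can be sketched in a few lines. The attack perturbs an input by a small amount epsilon in the direction of the sign of the loss gradient with respect to that input. The sketch below is only an illustration under stated assumptions: it swaps the pre-trained CNNs studied in this work for a toy logistic-regression classifier so that the gradient can be written by hand, and the weights `w`, bias `b`, input `x`, and the value of `epsilon` are all hypothetical. For image inputs, the perturbed values would typically also be clipped back to the valid pixel range, which is omitted here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    # Probability of class 1 under a toy logistic-regression model
    # (a stand-in for a CNN; not one of the models used in this work).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_grad_wrt_input(w, b, x, y):
    # For binary cross-entropy with a sigmoid output, dL/dx = (p - y) * w.
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, epsilon):
    # FGSM step: x_adv = x + epsilon * sign(grad_x L(x, y)).
    grad = loss_grad_wrt_input(w, b, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy values chosen only for illustration.
w = [2.0, -3.0, 1.0]
b = 0.0
x = [0.5, -0.2, 0.1]
y = 1  # true label

x_adv = fgsm(w, b, x, y, epsilon=0.3)
print("clean probability of class 1:      ", predict_proba(w, b, x))
print("adversarial probability of class 1:", predict_proba(w, b, x_adv))
```

In this toy setting the clean input is assigned to class 1, while the perturbed input, which differs from it by at most `epsilon` in each coordinate, falls on the other side of the decision boundary. With a deep network the gradient would come from automatic differentiation rather than a closed form, but the update rule is the same.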