Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism

From arXiv: This report pertains to the Capstone Project done by Group 1 of the Fall 2023 batch of students at Praxis Tech School, Kolkata, India. The report consists of 35 pages and includes 15 figures and 10 tables. This is the preprint, which will be submitted to an IEEE international conference for review.

This technical report explores adversarial attacks on Deep Neural Networks (DNNs) used for image classification and investigates defense mechanisms for improving model robustness. It examines the impact of two prominent attack methods, the Fast Gradient Sign Method (FGSM) and the Carlini-Wagner (CW) attack, on three pre-trained image classifiers (Resnext50_32x4d, DenseNet-201, and VGG-19) using the Tiny-ImageNet dataset. The study then proposes defensive distillation as a defense against FGSM and CW attacks and evaluates it on the CIFAR-10 dataset, with the CNN models resnet101 and Resnext50_32x4d serving as the teacher and student models, respectively. The distilled model is effective against FGSM but remains susceptible to more sophisticated techniques such as the CW attack. The report validates the proposed scheme and presents detailed results on the efficacy and limitations of the defense, offering insight into how adversarial attacks affect DNNs and how well defensive strategies mitigate them.
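For readers unfamiliar with the two core ideas named above, the following is a minimal sketch and not the report's implementation. It assumes a toy logistic-regression classifier in place of the CNNs used in the study, and the names `fgsm` and `softmax_t` are hypothetical. The first function applies the one-step FGSM update x_adv = x + eps * sign(grad_x L); the second shows the temperature-scaled softmax that defensive distillation uses to produce soft labels from a teacher's logits.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, w, b, eps):
    """One-step FGSM against a toy logistic-regression model (an assumption;
    the report attacks CNNs). Returns x + eps * sign(dL/dx), clipped to [0, 1]."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # Gradient of the binary cross-entropy loss w.r.t. the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    # Step in the direction that increases the loss, then keep values in range.
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]

def softmax_t(logits, T=1.0):
    """Softmax at temperature T; a larger T gives softer (flatter) probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((l - m) / T) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    w, b = [2.0, -3.0], 0.5          # hypothetical model parameters
    x, y = [0.6, 0.2], 1             # clean input with true label 1
    x_adv = fgsm(x, y, w, b, eps=0.25)
    score = lambda v: sigmoid(sum(wi * vi for wi, vi in zip(w, v)) + b)
    print("clean score:", score(x), "adversarial score:", score(x_adv))
    print("T=1 :", softmax_t([2.0, 1.0, 0.0], T=1.0))
    print("T=20:", softmax_t([2.0, 1.0, 0.0], T=20.0))
```

In this toy setting the perturbation moves the model's score for the true class from above 0.5 to below it, which is the kind of misclassification FGSM aims for. The CW attack is not sketched here because it requires an iterative optimization over the perturbation rather than a single gradient step.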
