Deep learning models achieve state-of-the-art performance on many tasks, yet they remain susceptible to adversarial attacks that exploit inherent vulnerabilities in their architectures. Such attacks add imperceptible perturbations to the input data, causing the model to misclassify it or produce erroneous outputs. This work aims to enhance the robustness of targeted classifier models against adversarial attacks. To this end, a convolutional autoencoder-based approach is employed that effectively counters adversarial perturbations introduced into the input images. By generating images that closely resemble the input images, the proposed methodology aims to restore the model's accuracy.