This paper presents GReAT (Graph Regularized Adversarial Training), a novel regularization method designed to improve the robust classification performance of deep learning models. Adversarial examples, inputs altered by subtle perturbations that mislead a model, pose a significant challenge in machine learning. Although adversarial training is effective in defending against such attacks, it often overlooks the underlying structure of the data. GReAT addresses this gap by integrating graph-based regularization into the adversarial training process, using the data's inherent structure to strengthen model robustness. By incorporating graph information during training, GReAT defends against adversarial attacks and improves generalization to unseen data. Extensive evaluations on benchmark datasets show that GReAT outperforms state-of-the-art methods in robustness, with notable gains in classification accuracy. Compared with the second-best method, GReAT improves performance under FGSM attack by approximately 4.87% on CIFAR10 and 10.57% on SVHN, and under PGD attack by approximately 11.05% on CIFAR10 and 5.54% on SVHN. The paper details the proposed methodology and reports numerical results and comparisons with existing approaches, highlighting the impact of GReAT on the robustness of deep learning models.
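The idea of adding a graph term to an adversarial training objective can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the graph is built or how the regularizer is defined, so everything below is an assumption made for illustration. A logistic-regression model stands in for the deep network, adversarial examples come from FGSM, and the hypothetical `adjacency` matrix, `alpha` weight, and `eps` budget are placeholder names. The graph term simply penalizes differences between the clean output of one sample and the adversarial output of its graph neighbours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(x, y, w, b):
    """Mean binary cross-entropy of a logistic-regression model."""
    p = np.clip(sigmoid(x @ w + b), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fgsm(x, y, w, b, eps):
    """FGSM step: x_adv = x + eps * sign(d loss / d x)."""
    p = sigmoid(x @ w + b)              # predicted P(y = 1), shape (n,)
    grad_x = (p - y)[:, None] * w       # per-sample gradient of the loss w.r.t. x
    return x + eps * np.sign(grad_x)

def graph_penalty(outputs_a, outputs_b, adjacency):
    """Edge-weighted mean squared difference between neighbouring outputs."""
    diff = outputs_a[:, None] - outputs_b[None, :]   # (n, n) pairwise differences
    return float(np.sum(adjacency * diff ** 2) / max(adjacency.sum(), 1e-12))

def graph_regularized_adv_loss(x, y, w, b, adjacency, eps=0.1, alpha=0.5):
    """Assumed objective: clean loss + adversarial loss + alpha * graph term."""
    x_adv = fgsm(x, y, w, b, eps)
    clean_out = sigmoid(x @ w + b)
    adv_out = sigmoid(x_adv @ w + b)
    supervised = cross_entropy(x, y, w, b) + cross_entropy(x_adv, y, w, b)
    return supervised + alpha * graph_penalty(clean_out, adv_out, adjacency)
```

In this sketch, setting `alpha=0` or passing an all-zero `adjacency` recovers plain adversarial training on clean plus FGSM examples, which makes the contribution of the graph term easy to isolate.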