The escalating threat of adversarial attacks on deep learning models, particularly in security-critical fields, has underscored the need for robust deep learning systems. Conventional robustness evaluations rely on adversarial accuracy, which measures a model's performance at one specific perturbation intensity. This single metric cannot capture a model's overall resilience across varying degrees of perturbation. To address this gap, we propose a new metric, adversarial hypervolume, which assesses the robustness of deep learning models over a range of perturbation intensities from a multi-objective optimization standpoint. The metric enables in-depth comparison of defense mechanisms and exposes how marginal the robustness gains from weaker defensive strategies are. Additionally, we adopt a novel training algorithm that enhances adversarial robustness uniformly across perturbation intensities, in contrast to methods that narrowly optimize adversarial accuracy. Our extensive empirical studies validate the adversarial hypervolume metric, demonstrating that it reveals subtle differences in robustness that adversarial accuracy overlooks. This research contributes a new measure of robustness and establishes a standard for assessing and benchmarking the resilience of current and future defensive models against adversarial threats.
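To illustrate why a single perturbation intensity can hide differences, the sketch below computes a simplified stand-in for a range-based score: the normalized area under an accuracy-versus-perturbation curve. This is not the paper's formal definition of adversarial hypervolume, which the abstract does not state. The function name `robustness_area`, the use of accuracy as the second objective, and all numbers are assumptions made up for illustration.

```python
def robustness_area(epsilons, accuracies):
    """Normalized trapezoidal area under an accuracy-vs-perturbation curve.

    Hypothetical proxy for a range-based robustness score; the paper's
    adversarial hypervolume may be defined differently.
    """
    if len(epsilons) != len(accuracies) or len(epsilons) < 2:
        raise ValueError("need at least two (epsilon, accuracy) pairs")
    pairs = sorted(zip(epsilons, accuracies))
    span = pairs[-1][0] - pairs[0][0]
    if span <= 0:
        raise ValueError("epsilons must cover a non-empty range")
    area = 0.0
    # Sum the trapezoid between each pair of neighboring perturbation budgets.
    for (e0, a0), (e1, a1) in zip(pairs, pairs[1:]):
        area += 0.5 * (a0 + a1) * (e1 - e0)
    return area / span


# Made-up curves: perturbation budgets in units of 1/255 (illustrative only).
eps = [0, 2, 4, 6, 8]
model_a = [0.90, 0.80, 0.70, 0.60, 0.50]  # degrades gradually
model_b = [0.90, 0.60, 0.55, 0.52, 0.50]  # collapses early, then plateaus

# Both models have the same adversarial accuracy at the largest budget...
print(model_a[-1] == model_b[-1])
# ...but the area over the whole range separates them.
print(robustness_area(eps, model_a), robustness_area(eps, model_b))
```

In this toy setting the two models are indistinguishable by adversarial accuracy at the largest budget, while the range-based score ranks the gradually degrading model higher. This is the kind of difference the abstract says the single-point metric overlooks.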