Ensembles of deep learning models have been extensively explored as a defense against adversarial attacks. Diversity among sub-models increases the attack cost required to deceive the majority of the ensemble, thereby improving adversarial robustness. Existing approaches mainly focus on increasing diversity in feature representations or the dispersion of first-order gradients with respect to the input, but the limited correlation between these diversity metrics and adversarial robustness constrains the performance of ensemble adversarial defense. In this work, we aim to enhance ensemble diversity by reducing attack transferability. We identify second-order gradients, which describe the curvature of the loss, as a key factor in adversarial robustness. Because computing the Hessian matrix involved in second-order gradients is computationally expensive, we approximate the Hessian-vector product using differential approximation. Given that low curvature provides better robustness, our ensemble model is designed to account for the influence of curvature among different sub-models. We introduce a novel regularizer to train multiple low-curvature network models that are more diverse. Extensive experiments across various datasets demonstrate that our ensemble model exhibits superior robustness against a range of attacks, underscoring the effectiveness of our approach.
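As an illustration of approximating a Hessian-vector product without forming the Hessian, the sketch below uses a forward finite difference of gradients, H(x)v ≈ (∇f(x + hv) − ∇f(x)) / h. This is one common reading of "differential approximation" and is an assumption here, not necessarily the exact scheme used in this work. The toy quadratic loss, the matrix `A`, the step size `h`, and the function names are all hypothetical and chosen only so that the exact answer (`A v`) is known.

```python
# Finite-difference Hessian-vector product (illustrative sketch).
# Assumption: toy loss f(x) = 0.5 * x^T A x, so grad f(x) = A x and the Hessian is A.

A = [[3.0, 1.0],
     [1.0, 2.0]]  # symmetric matrix defining the hypothetical toy loss


def grad(x):
    """Analytic gradient of the toy quadratic loss: A @ x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def hvp_finite_difference(grad_fn, x, v, h=1e-3):
    """Approximate H(x) v as (grad(x + h v) - grad(x)) / h.

    Only two gradient evaluations are needed; the Hessian is never built.
    """
    x_shifted = [x_i + h * v_i for x_i, v_i in zip(x, v)]
    g0 = grad_fn(x)
    g1 = grad_fn(x_shifted)
    return [(b - a) / h for a, b in zip(g0, g1)]


x = [0.5, -1.0]   # point at which curvature is probed
v = [1.0, 2.0]    # probe direction (e.g. a perturbation direction)

approx = hvp_finite_difference(grad, x, v)
exact = [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]  # A @ v

# A simple curvature proxy along v: the norm of the Hessian-vector product.
curvature_proxy = sum(c * c for c in approx) ** 0.5
```

For a quadratic loss the forward difference matches `A v` up to floating-point rounding. For a real network, `grad_fn` would be the gradient of the loss with respect to the input, and the choice of `h` trades truncation error against rounding error.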