Deep neural networks (DNNs) have achieved tremendous success in many remote sensing (RS) applications, but their vulnerability to adversarial perturbations should not be neglected. Current adversarial defense approaches in RS studies typically require prior knowledge of the adversarial perturbations in RS data, which leads to unstable performance and unnecessary re-training costs. To circumvent these challenges, we propose a universal adversarial defense approach for RS imagery (UAD-RS) that uses pre-trained diffusion models to defend common DNNs against multiple unknown adversarial attacks. Specifically, generative diffusion models are first pre-trained on different RS datasets to learn generalized representations in various data domains. A universal adversarial purification framework is then built on the forward and reverse processes of the pre-trained diffusion models to remove the perturbations from adversarial samples. Furthermore, an adaptive noise level selection (ANLS) mechanism identifies the noise level of the diffusion model whose purified outputs are closest to the clean samples, as measured by the Fréchet Inception Distance (FID) in deep feature space. As a result, only a single pre-trained diffusion model is needed for the universal purification of adversarial samples on each dataset, which removes the need to re-train for each attack setting and maintains high performance without prior knowledge of the adversarial perturbations. Experiments on four heterogeneous RS datasets covering scene classification and semantic segmentation verify that UAD-RS outperforms state-of-the-art adversarial purification approaches, providing a universal defense against seven common adversarial perturbations.
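The purification and noise-level selection described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a standard DDPM-style linear noise schedule, treats the pre-trained reverse process and the deep feature extractor as hypothetical callables (`reverse_fn`, `feature_fn`), and replaces the full FID with the Fréchet distance between one-dimensional Gaussians fitted to scalar features. All function names and default values are illustrative.

```python
import math
import random
from statistics import fmean, pstdev


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products alpha_bar_t for a linear beta schedule (assumed, not from the paper)."""
    alpha_bar, prod = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def forward_diffuse(x, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a) * x_0, (1 - a) * I), with a = alpha_bar_t."""
    signal, noise = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x]


def frechet_1d(a, b):
    """Frechet distance between 1-D Gaussians fitted to two lists of scalar features.

    This is a simplified stand-in for FID, which uses multivariate Inception features.
    """
    return (fmean(a) - fmean(b)) ** 2 + (pstdev(a) - pstdev(b)) ** 2


def select_noise_level(adv_batch, clean_feats, candidates, alpha_bar,
                       reverse_fn, feature_fn, rng):
    """Return the candidate step whose purified outputs are closest to the clean features.

    reverse_fn(x_t, t) is a hypothetical placeholder for the pre-trained reverse process;
    feature_fn(x) is a hypothetical placeholder for a deep feature extractor.
    """
    best_t, best_d = None, math.inf
    for t in candidates:
        # Purify: diffuse each adversarial sample to step t, then denoise it.
        purified = [reverse_fn(forward_diffuse(x, alpha_bar[t], rng), t)
                    for x in adv_batch]
        d = frechet_1d([feature_fn(x) for x in purified], clean_feats)
        if d < best_d:
            best_t, best_d = t, d
    return best_t, best_d
```

In this sketch, a larger step `t` injects more noise, which is more likely to drown out a perturbation but also destroys more image content, so the selection loop picks the candidate whose purified outputs stay closest to the clean feature statistics.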