Although convolutional neural networks (CNNs) have achieved success in computer vision tasks, they are vulnerable to backdoor attacks. Such attacks mislead the victim model into making an attacker-chosen prediction whenever a specific trigger pattern is present. Until now, trigger injection in existing attacks has been mainly limited to the spatial domain. Recent works exploit the perceptual properties of planting specific patterns in the frequency domain, which translate into only indistinguishable pixel-wise perturbations in the pixel domain. However, in the black-box setup, the inaccessibility of the training process often complicates trigger design. Existing frequency attacks simply handcraft the magnitude of the spectrum, which introduces anomalous frequency disparities between clean and poisoned data and risks removal of the trigger by image processing operations (such as lossy compression and filtering). In this paper, we propose a robust low-frequency black-box backdoor attack (LFBA), which minimally perturbs the low-frequency components of the frequency spectrum while maintaining perceptual similarity in the spatial domain. The key insight of our attack is to restrict the search for the optimal trigger to the low-frequency region, which can achieve high attack effectiveness, robustness against image transformation defenses, and stealthiness in both the spatial and frequency domains. We utilize simulated annealing (SA), a stochastic metaheuristic search algorithm, to optimize the properties of the frequency trigger, including the number of manipulated frequency bands and the perturbation of each frequency component, without relying on knowledge of the victim classifier. Extensive experiments on real-world datasets verify the effectiveness and robustness of LFBA against image processing operations and state-of-the-art backdoor defenses, as well as its inherent stealthiness in both the spatial and frequency domains, which makes it resilient against frequency inspection.
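The search described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the 8x8 orthonormal DCT, the choice of low-frequency bands (`LOW_BANDS`), the surrogate objective `toy_cost`, and every hyperparameter are assumptions made purely for illustration. In particular, `toy_cost` is a hypothetical stand-in that trades pixel-domain distortion against total trigger magnitude; the abstract does not specify the objective LFBA actually optimizes. The sketch only shows how simulated annealing can jointly choose how many low-frequency bands to perturb and by how much, without querying any classifier.

```python
import math
import random

N = 8  # toy image side length (assumption)
# Assumed "low-frequency" region: bands near the DC term, DC itself excluded.
LOW_BANDS = [(u, v) for u in range(3) for v in range(3) if 0 < u + v <= 2]

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(img, c):
    return matmul(matmul(c, img), transpose(c))    # C X C^T

def idct2(coef, c):
    return matmul(matmul(transpose(c), coef), c)   # C^T Y C

def poison(img, trigger, c):
    """Add per-band offsets to the spectrum, then return the clipped image."""
    coef = dct2(img, c)
    for (u, v), delta in trigger.items():
        coef[u][v] += delta
    return [[min(1.0, max(0.0, p)) for p in row] for row in idct2(coef, c)]

def toy_cost(img, trigger, c, target=0.6, weight=100.0):
    """Hypothetical surrogate objective (not from the paper): weighted pixel
    MSE (stealth) plus the shortfall of total trigger magnitude below
    `target` (a crude proxy for trigger strength)."""
    poisoned = poison(img, trigger, c)
    mse = sum((p - q) ** 2 for r0, r1 in zip(img, poisoned)
              for p, q in zip(r0, r1)) / len(img) ** 2
    shortfall = max(0.0, target - sum(abs(d) for d in trigger.values()))
    return weight * mse + shortfall

def anneal(img, c, bands, steps=400, t0=0.5, cooling=0.99, seed=0):
    """Simulated annealing over {band: delta} dictionaries."""
    rng = random.Random(seed)
    state = {bands[0]: 0.05}
    cost = toy_cost(img, state, c)
    best, best_cost = dict(state), cost
    temp = t0
    for _ in range(steps):
        cand = dict(state)
        band = rng.choice(bands)
        if band in cand and len(cand) > 1 and rng.random() < 0.2:
            del cand[band]  # move: use one band fewer
        else:
            # move: add a new band or adjust an existing band's offset
            cand[band] = cand.get(band, 0.0) + rng.uniform(-0.05, 0.05)
        cand_cost = toy_cost(img, cand, c)
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(-increase / temperature).
        if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / temp):
            state, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = dict(state), cost
        temp *= cooling  # geometric cooling schedule
    return best, best_cost

C = dct_matrix(N)
image = [[0.4 + 0.05 * ((i + j) % 4) for j in range(N)] for i in range(N)]
trigger, cost = anneal(image, C, LOW_BANDS)
poisoned = poison(image, trigger, C)
max_change = max(abs(p - q) for r0, r1 in zip(image, poisoned)
                 for p, q in zip(r0, r1))
print(f"bands used: {len(trigger)}, cost: {cost:.4f}, max pixel change: {max_change:.4f}")
```

Because the state is a dictionary keyed by band, a single search covers both quantities the abstract names: the number of manipulated bands (the dictionary size) and the perturbation of each component (the values). A realistic attack would replace `toy_cost` with an objective that reflects attack effectiveness and robustness to image transformations.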