Deep neural networks (DNNs) are susceptible to backdoor attacks, in which malicious functionality is embedded in a model so that attackers can trigger incorrect classifications. Old-school backdoor attacks use strong trigger features that victim models learn easily. This makes the backdoor robust to input variation, but that same robustness increases the likelihood of unintentional trigger activations. Such activations leave traces for existing defenses, which use, e.g., reverse engineering and sample overlay to find approximate replacements that activate the backdoor without being identical to the original trigger. In this paper, we propose and investigate a new characteristic of backdoor attacks, namely backdoor exclusivity, which measures the ability of backdoor triggers to remain effective in the presence of input variation. Building upon this concept, we propose Backdoor Exclusivity LifTing (BELT), a novel technique that suppresses the association between the backdoor and fuzzy triggers to enhance backdoor exclusivity for defense evasion. Extensive evaluation on three popular backdoor benchmarks validates that our approach substantially enhances the stealthiness of four old-school backdoor attacks, which, after backdoor exclusivity lifting, are able to evade seven state-of-the-art backdoor countermeasures at almost no cost to attack success rate or normal utility. For example, BadNet, one of the earliest backdoor attacks, when enhanced by BELT evades most state-of-the-art defenses, including ABS and MOTH, which would otherwise recognize the backdoored model.