Deep neural networks (DNNs) are vulnerable to backdoor attacks. Backdoor adversaries aim to maliciously control the predictions of attacked DNNs by injecting, during the training process, hidden backdoors that can be activated by adversary-specified trigger patterns. A recent study revealed that most existing attacks fail in the real physical world, since the trigger contained in digitized test samples may differ from the one used for training. Accordingly, users can adopt spatial transformations as image pre-processing to deactivate hidden backdoors. In this paper, we explore these findings from another perspective: we exploit classical spatial transformations (i.e., rotation and translation) with specific parameters as trigger patterns to design a simple yet effective poisoning-based backdoor attack. For example, only images rotated to a particular angle can activate the embedded backdoor of attacked DNNs. Extensive experiments verify the effectiveness of our attack under both digital and physical settings and its resistance to existing backdoor defenses.
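To make the idea of a transformation-based trigger concrete, the following is a minimal sketch of how a training set might be poisoned with a rotation trigger. It is an illustration under stated assumptions, not the paper's implementation: the function names (`rotate_image`, `poison_dataset`), the nearest-neighbour rotation, the poisoning rate, and the relabeling of poisoned samples to a single target class are all hypothetical choices that the abstract does not specify. Images are represented as plain 2D lists of grayscale values to keep the example dependency-free.

```python
import math
import random


def rotate_image(image, angle_deg, fill=0):
    """Rotate a 2D grayscale image (list of rows) about its center.

    Uses inverse mapping with nearest-neighbour sampling. Output pixels
    whose source falls outside the image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Inverse rotation: locate the source pixel that lands on (x, y).
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                rotated[y][x] = image[src_y][src_x]
    return rotated


def poison_dataset(images, labels, target_label, angle_deg, poison_rate, seed=0):
    """Return a poisoned copy of (images, labels) plus the poisoned indices.

    Assumption (not stated in the abstract): a randomly chosen fraction of
    samples is rotated by the trigger angle and relabeled to `target_label`.
    """
    rng = random.Random(seed)
    num_poison = int(len(images) * poison_rate)
    chosen = set(rng.sample(range(len(images)), num_poison))
    poisoned_images, poisoned_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            # The rotation itself is the trigger; no patch is stamped on the image.
            poisoned_images.append(rotate_image(img, angle_deg))
            poisoned_labels.append(target_label)
        else:
            poisoned_images.append(img)
            poisoned_labels.append(lab)
    return poisoned_images, poisoned_labels, sorted(chosen)


if __name__ == "__main__":
    sample = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    images = [sample for _ in range(10)]
    labels = list(range(10))
    # Hypothetical settings: 90-degree trigger, 30% poisoning rate, target class 0.
    p_images, p_labels, idx = poison_dataset(images, labels, 0, 90, 0.3)
    print("poisoned indices:", idx)
```

At inference time, under the same assumptions, an adversary would apply `rotate_image` with the same angle to a test image in order to activate the backdoor, while images at other angles would be expected to behave normally. A translation trigger could be sketched the same way by replacing the rotation with a fixed pixel shift.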