A backdoor attack aims to compromise a model so that it returns an adversary-chosen output when a specific trigger pattern appears, yet behaves normally on clean inputs. Current backdoor attacks require modifying the pixels of clean images, which makes the attacks less stealthy and harder to implement physically. This paper proposes a novel invisible physical backdoor based on camera imaging that does not change natural image pixels. Specifically, a compromised model returns a target label for images taken by a particular camera, while it returns correct results for all other images. To implement and evaluate the proposed backdoor, we photograph different objects from multiple angles using multiple smartphones to build a new dataset of 21,500 images. Conventional backdoor attacks are ineffective against some classical models, such as ResNet18, on this dataset. Therefore, we propose a three-step training strategy to mount the backdoor attack. First, we design a camera identification model and train it on the phone IDs to extract the camera fingerprint feature. Next, we design a special network architecture that is easily compromised by our backdoor attack, leveraging the attributes of the CFA interpolation algorithm and combining them with the feature extraction block of the camera identification model. Finally, we transfer the backdoor from this special architecture to a classical architecture via teacher-student distillation learning. Because the trigger in our method is tied to a specific phone, the attack works effectively in the physical world. Experimental results demonstrate the feasibility of the proposed approach and its robustness against various backdoor defenses.
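The final step, transferring the backdoor from the special architecture (teacher) to a classical architecture (student), can be illustrated with a minimal sketch of a distillation loss. The abstract does not state which loss the authors use, so the temperature-softened KL term plus hard-label cross-entropy below is an assumption (the common Hinton-style formulation). The names `distillation_loss`, `temperature`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_total for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Per-sample distillation loss (assumed form, not taken from the paper).

    alpha weights the soft-target term that pulls the student toward the
    teacher; (1 - alpha) weights the cross-entropy against the hard label.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Cross-entropy of the student's unsoftened prediction against the label.
    ce = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft-term gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

In a setting like the one described, the teacher's logits for an image from the trigger camera would favor the target label, so minimizing the soft-target term would push the student to reproduce that behavior, while images from other cameras would carry the teacher's clean predictions. How the hard label is chosen for trigger-camera images is not specified in the abstract and is left to the caller here.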