Deep learning based person re-identification (re-id) models have been widely employed in surveillance systems. Recent studies have demonstrated that black-box single-modality and cross-modality re-id models are vulnerable to adversarial examples (AEs), while the robustness of multi-modality re-id models remains unexplored. Since the specific type of model deployed in a target black-box surveillance system is unknown, we aim to generate modality-unified AEs for omni-modality (single-, cross-, and multi-modality) re-id models. Specifically, we propose a novel Modality Unified Attack (MUA) method that trains modality-specific adversarial generators to produce AEs that effectively attack different omni-modality models. A multi-modality model is adopted as the surrogate model, wherein the features of each modality are perturbed by a metric disruption loss before fusion. To collapse the features shared across omni-modality models, a Cross Modality Simulated Disruption approach is introduced that mimics cross-modality feature embeddings by intentionally feeding images to the non-corresponding modality-specific subnetworks of the surrogate model. Moreover, a Multi Modality Collaborative Disruption strategy is devised that enables the attacker to comprehensively corrupt the informative content of person images by leveraging a multi-modality feature collaborative metric disruption loss. Extensive experiments show that our MUA method can effectively attack omni-modality re-id models, achieving mean mAP Drop Rates of 55.9%, 24.4%, 49.0%, and 62.7%, respectively.
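The core ideas above can be sketched in a toy form. This is a minimal illustration, not the paper's implementation: the modality-specific subnetworks are reduced to hypothetical random linear maps (`W_rgb`, `W_ir`), the metric disruption loss is assumed to be a cosine-similarity term whose minimization pushes adversarial features away from clean ones, and the cross-modality simulated disruption is shown by routing an RGB image through the non-corresponding IR branch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical modality-specific subnetworks of the surrogate model,
# reduced to fixed linear projections for illustration only.
W_rgb = rng.standard_normal((16, 8))
W_ir = rng.standard_normal((16, 8))

def embed(x, W):
    """Project a flattened image into the shared feature space, L2-normalized."""
    f = x @ W
    return f / (np.linalg.norm(f) + 1e-12)

def metric_disruption_loss(f_adv, f_clean):
    """Assumed form of the metric disruption loss: cosine similarity between
    adversarial and clean features; minimizing it drives the features apart."""
    return float(f_adv @ f_clean)

x = rng.standard_normal(16)            # clean image, flattened
delta = 0.1 * rng.standard_normal(16)  # adversarial perturbation (placeholder)

# Perturb features of the matching (RGB) branch before fusion.
f_clean = embed(x, W_rgb)
f_adv = embed(x + delta, W_rgb)

# Cross Modality Simulated Disruption: feed the RGB image through the
# non-corresponding IR branch to mimic cross-modality feature embeddings.
f_cross_clean = embed(x, W_ir)
f_cross_adv = embed(x + delta, W_ir)

# Collaborative objective over both branches: the generator would be trained
# to minimize this sum, corrupting features in all simulated modalities jointly.
total = metric_disruption_loss(f_adv, f_clean) \
    + metric_disruption_loss(f_cross_adv, f_cross_clean)
print(round(total, 4))
```

In the actual method the perturbation `delta` would come from a trained modality-specific generator and the branches would be deep subnetworks; the sketch only shows how the per-modality and simulated cross-modality disruption terms combine into one objective.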