Robust Backdoor Attack with Visible, Semantic, Sample-Specific, and Compatible Triggers

Deep neural networks (DNNs) can be manipulated to exhibit specific behaviors when exposed to specific trigger patterns, without affecting their performance on benign samples; this is known as a backdoor attack. Some recent research has focused on designing invisible triggers for backdoor attacks to ensure visual stealthiness while remaining highly effective, even under backdoor defenses. However, we find that these carefully designed invisible triggers are often sensitive to visual distortions during inference, such as Gaussian blurring or environmental variations in physical scenarios. This phenomenon could significantly undermine the practical effectiveness of attacks, but it has rarely received attention or been thoroughly investigated. To address this limitation, we define a novel trigger called the Visible, Semantic, Sample-Specific, and Compatible trigger (VSSC trigger), which aims to be effective, stealthy, and robust to visual distortion simultaneously. To implement it, we develop an approach that uses large language models to choose a suitable trigger and text-guided image editing techniques to generate the poisoned image containing the trigger. Extensive experimental results and analysis validate the effectiveness, stealthiness, and robustness of the VSSC trigger. It demonstrates superior robustness to distortions compared with most digital backdoor attacks and allows more efficient and flexible trigger integration than physical backdoor attacks. We hope that the proposed VSSC trigger and implementation approach will inspire future studies on designing more practical triggers for backdoor attacks.
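
The sensitivity of invisible triggers to Gaussian blurring can be illustrated with a toy calculation. The sketch below is not the experimental setup of this work: it assumes a grayscale image, models an "invisible" trigger as a low-amplitude, high-frequency checkerboard perturbation and a "visible" trigger as a solid patch, and measures how much of each trigger's signal (L2 norm) survives a Gaussian blur. The function names `gaussian_blur` and `retained_fraction`, the blur strength `sigma=1.5`, and both trigger shapes are assumptions chosen for illustration only.

```python
import numpy as np


def gaussian_blur(img: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """Separable Gaussian blur of a 2-D grayscale image (reflect padding)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()  # normalize so constant regions keep their value
    padded = np.pad(img, radius, mode="reflect")
    # Blur along rows first, then along columns; output has the input's shape.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def retained_fraction(trigger: np.ndarray, sigma: float = 1.5) -> float:
    """Fraction of the trigger's L2 norm that survives the blur.

    The blur is linear, so blur(image + trigger) - blur(image) equals
    blur(trigger); the clean image itself is not needed here.
    """
    blurred = gaussian_blur(trigger, sigma)
    return float(np.linalg.norm(blurred) / np.linalg.norm(trigger))


size = 32
ii, jj = np.indices((size, size))

# Assumed "invisible" trigger: +/- 2/255 checkerboard over the whole image.
invisible = (2.0 / 255.0) * (-1.0) ** (ii + jj)

# Assumed "visible" trigger: an 8x8 solid patch away from the image border.
visible = np.zeros((size, size))
visible[12:20, 12:20] = 1.0

retained_invisible = retained_fraction(invisible)
retained_visible = retained_fraction(visible)
print(f"invisible trigger retained: {retained_invisible:.6f}")
print(f"visible trigger retained:   {retained_visible:.6f}")
```

Under these assumptions the high-frequency perturbation is almost entirely removed by the blur, while most of the patch signal survives. Whether an actual attack still succeeds after distortion depends on the trained model and on the real trigger design, which this linear toy example does not capture.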
