Object detection is fundamental to many real-world applications, such as security monitoring and surveillance video analysis. Despite their advances, state-of-the-art object detectors remain vulnerable to adversarial patch attacks, which can be easily applied to real-world objects to either conceal actual items or create non-existent ones, leading to severe consequences. In this work, we introduce DisPatch, the first diffusion-based defense framework for object detection. Unlike previous works that aim to "detect and remove" adversarial patches, DisPatch adopts a "regenerate and rectify" strategy, leveraging generative models to disarm attack effects while preserving the integrity of the input image. Specifically, we use the in-distribution generative power of diffusion models to regenerate the entire image, aligning it with benign data. A rectification process then identifies adversarial regions and replaces them with their regenerated benign counterparts. DisPatch is attack-agnostic and requires no prior knowledge of the patches. Extensive experiments across multiple detectors demonstrate that DisPatch consistently outperforms state-of-the-art defenses on both hiding attacks and creating attacks, achieving the best overall mAP@0.5 score of 89.3% on hiding attacks and lowering the attack success rate to 24.8% on untargeted creating attacks. Moreover, it strikes a balance between effectiveness and efficiency and maintains strong robustness against adaptive attacks, making it a practical and reliable defense method.
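To make the "regenerate and rectify" idea concrete, the following is a minimal sketch of the rectification step only. It is not the DisPatch implementation: the abstract does not say how adversarial regions are identified, so this sketch assumes a simple per-pixel difference threshold between the input and its regenerated version. The function name `rectify`, the `threshold` parameter, and the toy arrays are all hypothetical, and the "regenerated" image is simulated with noise rather than produced by a diffusion model.

```python
import numpy as np


def rectify(image, regenerated, threshold=0.25):
    """Replace pixels where the input departs strongly from its regeneration.

    ASSUMPTION: a per-pixel difference threshold stands in for whatever
    region-identification procedure the actual method uses.
    Both arrays are H x W x C with values in [0, 1].
    """
    if image.shape != regenerated.shape:
        raise ValueError("image and regenerated must have the same shape")
    # Discrepancy map: absolute difference averaged over colour channels (H x W).
    diff = np.abs(image.astype(np.float64) - regenerated.astype(np.float64)).mean(axis=-1)
    # Pixels that differ strongly are treated as suspected adversarial regions.
    mask = diff > threshold
    # Keep original pixels outside the mask; use regenerated pixels inside it.
    rectified = np.where(mask[..., None], regenerated, image)
    return rectified, mask


# Toy data (hypothetical): a mid-grey benign image with a saturated square "patch".
rng = np.random.default_rng(0)
benign = rng.uniform(0.4, 0.6, size=(32, 32, 3))
attacked = benign.copy()
attacked[8:16, 8:16] = 1.0

# Stand-in for the diffusion output: the benign content plus small noise.
regenerated = benign + rng.normal(0.0, 0.01, size=benign.shape)

rectified, mask = rectify(attacked, regenerated)
print("flagged pixels:", int(mask.sum()))
```

In this toy setting only the patched square is flagged, so the rest of the input is passed through unchanged, which mirrors the stated goal of preserving the integrity of the input image. A real pipeline would need a more robust region test than a fixed pixel threshold, since a genuine diffusion regeneration also alters benign detail.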