In recent years, camera-based 3D object detection has gained widespread attention for its ability to achieve high performance with low computational cost. However, the robustness of these methods to adversarial attacks has not been thoroughly examined, a particular concern given their deployment in safety-critical domains such as autonomous driving. In this study, we conduct the first comprehensive investigation of the robustness of leading camera-based 3D object detection approaches under various adversarial conditions. We systematically analyze the resilience of these models under two attack settings (white-box and black-box), two attack objectives (classification and localization), and two types of adversarial attack techniques (pixel-based and patch-based). Our experiments yield four findings: (a) bird's-eye-view-based representations exhibit stronger robustness against localization attacks; (b) depth-estimation-free approaches have the potential to show stronger robustness; (c) accurate depth estimation effectively improves robustness for depth-estimation-based methods; (d) incorporating multi-frame benign inputs can effectively mitigate adversarial attacks. We hope these findings can steer the development of future camera-based object detection models with enhanced adversarial robustness.