Although the deep neural models used for perception in autonomous driving have proven vulnerable to adversarial examples, known attacks often rely on 2D patches and mostly target monocular perception. As a result, the effectiveness of Physical Adversarial Examples (PAEs) against stereo-based binocular depth estimation remains largely unexplored. To this end, we propose the first texture-enabled physical adversarial attack against stereo matching models in the context of autonomous driving. Our method employs a 3D PAE with a global camouflage texture rather than a local 2D patch, ensuring both visual consistency and attack effectiveness across the different viewpoints of the stereo cameras. To cope with the disparity between these cameras, we also propose a new 3D stereo matching rendering module that aligns the PAE with real-world positions and headings in binocular vision. We further propose a novel merging attack that seamlessly blends the target into the environment through fine-grained PAE optimization. Compared with existing hiding attacks, which fail to merge seamlessly into the background, the merging attack offers significantly greater stealth and lethality. Extensive evaluations show that our PAEs can successfully fool stereo models into producing erroneous depth information.
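To make the disparity effect mentioned above concrete, the following is a minimal sketch of how one 3D point lands at different horizontal pixel positions in the left and right images of a rectified stereo pair. It uses the standard pinhole camera model and is not the rendering module proposed in the paper. The function names and all parameter values (`focal_px`, `baseline_m`, `cx`, `cy`) are hypothetical and chosen only for illustration.

```python
def project_stereo(point, focal_px, baseline_m, cx, cy):
    """Project a 3D point into a rectified stereo pair (pinhole model).

    point      -- (x, y, z) in metres, expressed in the left-camera frame
    focal_px   -- focal length in pixels (assumed equal for both cameras)
    baseline_m -- distance between the camera centres in metres
    cx, cy     -- principal point in pixels (assumed equal for both cameras)
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the cameras")
    u_left = focal_px * x / z + cx
    # The right camera sits baseline_m to the right, so x shifts by -baseline_m.
    u_right = focal_px * (x - baseline_m) / z + cx
    v = focal_px * y / z + cy  # same image row in both views after rectification
    return (u_left, v), (u_right, v)


def disparity(point, focal_px, baseline_m):
    """Horizontal pixel offset between the two views: d = f * B / Z."""
    return focal_px * baseline_m / point[2]


if __name__ == "__main__":
    # Hypothetical rig: 1000 px focal length, 0.5 m baseline, 1280x720 image.
    left, right = project_stereo((1.0, 0.0, 10.0), 1000.0, 0.5, 640.0, 360.0)
    print(left, right, disparity((1.0, 0.0, 10.0), 1000.0, 0.5))
```

Because the disparity is inversely proportional to depth, a texture rendered on a 3D object has to be projected separately into each view for the two images to stay geometrically consistent. A stereo matching model that is driven to report a wrong disparity therefore reports a wrong depth.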