Deep neural networks (DNNs) have proven extremely susceptible to adversarial examples, which raises safety-critical concerns for DNN-based autonomous driving stacks (e.g., 3D object detection). Although there is extensive work on image-level attacks, most of it is restricted to 2D pixel space, and such attacks are not always physically realistic in our 3D world. Here we present Adv3D, the first exploration of modeling adversarial examples as Neural Radiance Fields (NeRFs). Advances in NeRF provide photorealistic appearance and accurate 3D generation, yielding more realistic and realizable adversarial examples. We train our adversarial NeRF by minimizing the confidence that 3D detectors predict for surrounding objects on the training set. We then evaluate Adv3D on an unseen validation set and show that it causes a large performance drop when the NeRF is rendered from any sampled pose. To generate physically realizable adversarial examples, we propose primitive-aware sampling and semantic-guided regularization, which enable 3D patch attacks with camouflaged adversarial textures. Experimental results demonstrate that the trained adversarial NeRF generalizes well across poses, scenes, and 3D detectors. Finally, we provide a defense against our attacks based on adversarial training through data augmentation. Project page: https://len-li.github.io/adv3d-web
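The optimization loop described above, rendering the adversarial object into training images and minimizing the detector's predicted confidence with respect to the NeRF parameters, can be sketched as follows. This is a minimal, hypothetical PyTorch illustration of the general technique, not the authors' implementation: AdvNeRF, paste_render, and detector_confidence are toy stand-ins for a real NeRF renderer, compositing step, and frozen 3D detector.

```python
# Minimal sketch of adversarial-NeRF training: optimize NeRF parameters
# so that a frozen 3D detector assigns low confidence to objects in the
# composited scene. All names here are hypothetical stand-ins.
import torch

class AdvNeRF(torch.nn.Module):
    """Stand-in for a trainable NeRF; a real model maps rays to (rgb, density)."""
    def __init__(self):
        super().__init__()
        # Toy learnable appearance; a real NeRF has MLP weights instead.
        self.texture = torch.nn.Parameter(torch.rand(3, 64, 64))

    def render(self, pose, image_hw):
        # A real NeRF would volume-render the object from `pose`; here we
        # simply return the learnable texture as an image-sized patch.
        h, w = image_hw
        return torch.sigmoid(self.texture).unsqueeze(0).expand(1, 3, h, w)

def paste_render(scene, patch):
    """Composite the rendered adversarial object into the scene image."""
    return torch.clamp(scene + 0.5 * patch, 0.0, 1.0)

def detector_confidence(image):
    """Stand-in for a frozen 3D detector's object confidence (differentiable)."""
    return image.mean()  # placeholder score

nerf = AdvNeRF()
opt = torch.optim.Adam(nerf.parameters(), lr=1e-2)

for step in range(100):
    scene = torch.rand(1, 3, 64, 64)        # a training image (toy data)
    pose = torch.eye(4)                     # a sampled render pose (fixed here)
    patch = nerf.render(pose, (64, 64))     # render the adversarial object
    adv_image = paste_render(scene, patch)  # place it into the scene
    loss = detector_confidence(adv_image)   # minimize detector confidence
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the actual method the detector is a full 3D object detector, poses are sampled per iteration so the attack transfers across viewpoints, and the primitive-aware sampling and semantic-guided regularization constrain which surface regions the texture may alter.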