Deep neural network-based LiDAR 3D object detection serves as a critical perception component in safety-critical autonomous systems. However, recent studies have revealed its vulnerability to backdoor attacks. Existing attacks typically require white-box access or label modification and focus on geometric attacks such as object disappearance or bounding-box manipulation. In this paper, we present Mirage, a black-box and clean-label backdoor attack against deep neural network-based LiDAR 3DOD. Mirage injects a small number of label-consistent poisoning samples into the training set, causing the model to learn a malicious association between a trigger pattern and an attacker-chosen target class while preserving normal training semantics. As a result, the compromised model behaves normally on benign inputs yet systematically misclassifies triggered objects as the target class during deployment. We evaluate Mirage on multiple state-of-the-art LiDAR 3DOD models and benchmark datasets. Experimental results show that Mirage achieves a 73% misclassification success rate with a poisoning rate of only 0.5%, while maintaining detection performance close to that of benign models.
翻译:暂无翻译