3D object detection from LiDAR point clouds is a fundamental task in computer vision, robotics, and autonomous driving. However, existing 3D detectors rely heavily on annotated datasets, and labeling 3D bounding boxes is both time-consuming and error-prone. In this paper, we propose Scene Completion Pre-training (SCP), a method that improves the performance of 3D object detectors when less labeled data is available. SCP offers three key advantages: (1) Improved initialization of the point cloud model. By completing scene point clouds, SCP captures the spatial and semantic relationships among objects in urban environments. (2) No need for additional datasets. SCP serves as an auxiliary network that imposes no additional effort or data requirements on the 3D detectors. (3) Less labeled data needed for detection. With SCP, existing state-of-the-art 3D detectors can achieve comparable performance while relying on only 20% of the labeled data.