Prevalent approaches to unsupervised 3D object detection rely on cluster-based pseudo-label generation followed by iterative self-training. However, LiDAR scans are sparse, so the resulting pseudo-labels often have erroneous sizes and positions, which leads to subpar detection performance. To tackle this problem, this paper introduces a Commonsense Prototype-based Detector, termed CPD, for unsupervised 3D object detection. CPD first constructs a Commonsense Prototype (CProto), characterized by a high-quality bounding box and dense points, based on commonsense intuition. CPD then refines low-quality pseudo-labels using the size prior from CProto. Furthermore, CPD improves detection accuracy on sparsely scanned objects using the geometric knowledge from CProto. CPD outperforms state-of-the-art unsupervised 3D detectors on the Waymo Open Dataset (WOD), PandaSet, and KITTI by a large margin. Moreover, when trained on WOD and tested on KITTI, CPD attains 90.85% and 81.01% 3D Average Precision on the easy and moderate car classes, respectively. These results bring CPD close to fully supervised detectors, highlighting the significance of our method. The code will be available at https://github.com/hailanyi/CPD.
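To make the idea of refining a pseudo-label with a size prior more concrete, here is a minimal sketch in Python. It is not the procedure from the paper: the abstract does not specify how the size prior is applied, so the function name `refine_box_with_size_prior`, the box layout `(cx, cy, cz, l, w, h, yaw)`, the example prior dimensions, and the heuristic itself (keep the box corner nearest to the sensor fixed and extend the box away from the sensor) are all assumptions made for illustration.

```python
import numpy as np


def refine_box_with_size_prior(box, prior_lwh, sensor_xy=(0.0, 0.0)):
    """Hypothetical helper: snap a pseudo-label box to a prior size.

    box       : (cx, cy, cz, l, w, h, yaw), assumed layout
    prior_lwh : (length, width, height) taken from a prototype
    sensor_xy : LiDAR position in the ground plane

    Assumed heuristic (not confirmed by the source): the corner closest
    to the sensor is the best-observed one, so it stays fixed while the
    box is resized to the prior dimensions.
    """
    cx, cy, cz, l, w, h, yaw = box
    pl, pw, ph = prior_lwh

    # Unit vectors along the box's length and width axes.
    ax_l = np.array([np.cos(yaw), np.sin(yaw)])
    ax_w = np.array([-np.sin(yaw), np.cos(yaw)])
    center = np.array([cx, cy], dtype=float)

    # Which side of the box faces the sensor along each axis.
    to_sensor = np.asarray(sensor_xy, dtype=float) - center
    sl = 1.0 if to_sensor @ ax_l >= 0 else -1.0
    sw = 1.0 if to_sensor @ ax_w >= 0 else -1.0

    # Corner nearest to the sensor; it is kept fixed.
    corner = center + sl * (l / 2) * ax_l + sw * (w / 2) * ax_w

    # Re-center so the resized box still touches that corner.
    new_center = corner - sl * (pl / 2) * ax_l - sw * (pw / 2) * ax_w

    # Keep the bottom face on the same plane while changing the height.
    new_cz = cz - h / 2 + ph / 2

    return np.array([new_center[0], new_center[1], new_cz, pl, pw, ph, yaw])


# Illustrative values only: a truncated car-like box 10 m ahead of the sensor.
noisy_box = (10.0, 0.0, 0.0, 2.0, 1.6, 1.5, 0.0)
prior = (4.5, 1.8, 1.6)  # made-up prototype size, not from the paper
refined = refine_box_with_size_prior(noisy_box, prior)
```

In this sketch the box grows away from the sensor, which reflects the intuition that the far side of a sparsely scanned object is the part most likely to be missing from the cluster.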