Prevalent approaches to unsupervised 3D object detection rely on cluster-based pseudo-label generation followed by iterative self-training. However, LiDAR scans are sparse, so the resulting pseudo-labels often have erroneous sizes and positions, which leads to subpar detection performance. To tackle this problem, this paper introduces a Commonsense Prototype-based Detector, termed CPD, for unsupervised 3D object detection. CPD first constructs a Commonsense Prototype (CProto), characterized by a high-quality bounding box and dense points, based on commonsense intuition. CPD then refines the low-quality pseudo-labels by leveraging the size prior from CProto. Furthermore, CPD improves the detection accuracy of sparsely scanned objects using the geometric knowledge from CProto. CPD outperforms state-of-the-art unsupervised 3D detectors on the Waymo Open Dataset (WOD), PandaSet, and KITTI by a large margin. Moreover, when trained on WOD and tested on KITTI, CPD attains 90.85% and 81.01% 3D Average Precision on the easy and moderate car classes, respectively. These results bring CPD close to fully supervised detectors, highlighting the significance of our method. The code will be available at https://github.com/hailanyi/CPD.
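To make the idea of refining a pseudo-label with a size prior more concrete, the following is a minimal sketch and not the CPD implementation. It assumes a bird's-eye-view box `(cx, cy, length, width, yaw)`, a hypothetical prototype size, and the heuristic that the box faces nearest the sensor are the most reliable and should stay fixed while the box is resized. The function name, the box layout, and the example sizes are all illustrative assumptions.

```python
import math


def refine_box_with_size_prior(box, proto_size, sensor_xy=(0.0, 0.0)):
    """Resize a BEV pseudo-label to a prototype size (illustrative only).

    box:        (cx, cy, length, width, yaw), yaw in radians
    proto_size: (length, width) taken from a class prototype (assumed given)
    sensor_xy:  sensor position in the same frame as the box
    The two box faces that point toward the sensor keep their position.
    """
    cx, cy, length, width, yaw = box
    proto_l, proto_w = proto_size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)

    # Express the sensor position in the box's local frame.
    dx, dy = sensor_xy[0] - cx, sensor_xy[1] - cy
    sx = dx * cos_y + dy * sin_y
    sy = -dx * sin_y + dy * cos_y

    # Which side of the box faces the sensor along each local axis.
    sign_x = 1.0 if sx >= 0 else -1.0
    sign_y = 1.0 if sy >= 0 else -1.0

    # Move the center away from the sensor by half of the size change,
    # so the sensor-facing faces stay where they were.
    shift_x = -sign_x * (proto_l - length) / 2.0
    shift_y = -sign_y * (proto_w - width) / 2.0

    # Rotate the local shift back into the original frame.
    new_cx = cx + shift_x * cos_y - shift_y * sin_y
    new_cy = cy + shift_x * sin_y + shift_y * cos_y
    return (new_cx, new_cy, proto_l, proto_w, yaw)


# A sparse cluster yields a box that is too short (2.0 m long);
# the hypothetical prototype says cars are about 4.5 m long.
refined = refine_box_with_size_prior((10.0, 0.0, 2.0, 1.8, 0.0), (4.5, 1.8))
```

In this example the near face of the box remains at x = 9.0 while the far face is pushed back, which mirrors the intuition that a sparse scan mostly observes the surfaces facing the sensor. How CPD actually builds CProto and applies its size prior is described in the paper and the released code.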