Exploring Active 3D Object Detection from a Generalization Perspective

To alleviate the high annotation cost of LiDAR-based 3D object detection, active learning is a promising solution: it learns to select only a small portion of unlabeled data to annotate, without compromising model performance. Our empirical study, however, suggests that mainstream uncertainty-based and diversity-based active learning policies are not effective when applied to the 3D detection task, as they fail to balance the trade-off between point cloud informativeness and box-level annotation costs. To overcome this limitation, we jointly investigate three novel criteria for point cloud acquisition in our framework Crb: label conciseness, feature representativeness, and geometric balance. These criteria hierarchically filter out point clouds with redundant 3D bounding box labels, latent features, and geometric characteristics (e.g., point cloud density) from the unlabeled sample pool and greedily select informative ones with fewer objects to annotate. Our theoretical analysis demonstrates that the proposed criteria align the marginal distributions of the selected subset with the prior distributions of the unseen test set and minimize the upper bound of the generalization error. To validate the effectiveness and applicability of Crb, we conduct extensive experiments on two benchmark 3D object detection datasets, KITTI and Waymo, and examine both one-stage (i.e., Second) and two-stage (i.e., Pv-rcnn) 3D detectors. Experiments show that the proposed approach outperforms existing active learning strategies and matches fully supervised performance while requiring annotations for only $1\%$ of bounding boxes and $8\%$ of point clouds. Source code: https://github.com/Luoyadan/CRB-active-3Ddet.
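
The hierarchical filtering described above can be illustrated with a minimal sketch. It is not the released Crb implementation: the three scoring rules (entropy of the predicted class histogram for label conciseness, greedy k-center selection for feature representativeness, and density-bin balancing for geometric balance) are simplified stand-ins chosen for illustration, and all function names, inputs, and stage budgets `k1`, `k2`, `k3` are hypothetical.

```python
import numpy as np


def label_entropy(class_counts):
    """Entropy of the predicted class histogram of each point cloud (assumed proxy)."""
    counts = np.asarray(class_counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    probs = counts / np.maximum(totals, 1.0)
    safe = np.where(probs > 0, probs, 1.0)  # log(1) = 0 for empty classes
    return -(probs * np.log(safe)).sum(axis=1)


def k_center_greedy(features, k):
    """Greedy farthest-point selection; returns local indices (assumed proxy)."""
    feats = np.asarray(features, dtype=float)
    chosen = [0]  # start from the first candidate
    dist = np.linalg.norm(feats - feats[0], axis=1)
    while len(chosen) < min(k, len(feats)):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(feats - feats[nxt], axis=1))
    return chosen


def balance_greedy(densities, k, bins=4):
    """Greedily fill the least-populated density bin; returns local indices."""
    dens = np.asarray(densities, dtype=float)
    edges = np.linspace(dens.min(), dens.max(), bins + 1)
    bin_ids = np.clip(np.digitize(dens, edges[1:-1]), 0, bins - 1)
    hist = np.zeros(bins)
    chosen, remaining = [], list(range(len(dens)))
    while remaining and len(chosen) < k:
        best = min(remaining, key=lambda i: hist[bin_ids[i]])
        hist[bin_ids[best]] += 1
        chosen.append(best)
        remaining.remove(best)
    return chosen


def select_point_clouds(class_counts, features, densities, k1, k2, k3):
    """Three-stage hierarchical filter: pool -> k1 -> k2 -> k3 point clouds."""
    features = np.asarray(features, dtype=float)
    densities = np.asarray(densities, dtype=float)
    entropy = label_entropy(class_counts)
    stage1 = np.argsort(-entropy, kind="stable")[:k1]           # label conciseness
    stage2 = stage1[k_center_greedy(features[stage1], k2)]      # representativeness
    stage3 = stage2[balance_greedy(densities[stage2], k3)]      # geometric balance
    return stage3
```

Each stage only sees the survivors of the previous one, so the more expensive criteria operate on progressively smaller candidate sets; the point clouds returned by the last stage are the ones sent for annotation.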
