Towards Open World Active Learning for 3D Object Detection

Significant strides have been made in closed world 3D object detection, where systems are tested in environments containing only known classes. Open world scenarios, in which new object classes appear, remain challenging. Existing efforts sequentially learn novel classes from streams of labeled data at a significant annotation cost, impeding efficient deployment in the wild. To seek effective solutions, we investigate a more practical yet challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), which aims to select a small number of 3D boxes to annotate while maximizing detection performance on both known and unknown classes. The core difficulty lies in striking a balance between mining more unknown instances and minimizing the labeling expense of point clouds. Empirically, our study finds that the harmonious and inverse relationships between box quantities and their confidences can help alleviate this dilemma, avoiding the repeated selection of common known instances and focusing on uncertain objects that are potentially unknown. We unify both relational constraints into a simple and effective active learning (AL) strategy, namely Open-CRB, which guides the acquisition of informative point clouds with the fewest boxes to label. Furthermore, we develop a comprehensive codebase for easy reproduction and future research, supporting 15 baseline methods (i.e., active learning, out-of-distribution detection and open world detection), 2 types of modern 3D detectors (i.e., one-stage SECOND and two-stage PV-RCNN) and 3 benchmark 3D datasets (i.e., KITTI, nuScenes and Waymo). Extensive experiments show that the proposed Open-CRB demonstrates superiority and flexibility in recognizing both novel and shared categories with very limited labeling costs, compared to state-of-the-art baselines.
