In this paper, we study the problem of unsupervised object detection from 3D point clouds in self-driving scenes. We present a simple yet effective method that exploits (i) point clustering in near-range areas where the point clouds are dense, (ii) temporal consistency to filter out noisy unsupervised detections, (iii) the translation equivariance of CNNs to extend the auto-labels to long range, and (iv) self-supervision so that the detector improves on its own. Our approach, OYSTER (Object Discovery via Spatio-Temporal Refinement), imposes no constraints on data collection (such as repeated traversals of the same location), detects objects in a zero-shot manner without supervised finetuning (even in sparse, distant regions), and continues to improve with more rounds of iterative self-training. To better measure model performance in self-driving scenarios, we propose a new planning-centric perception metric based on distance-to-collision. We demonstrate that our unsupervised object detector significantly outperforms unsupervised baselines on the PandaSet and Argoverse 2 Sensor datasets, showing promise that self-supervision combined with object priors can enable object discovery in the wild. For more information, visit the project website: https://waabi.ai/research/oyster
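To make step (ii) concrete, the following is a minimal sketch of a temporal-consistency filter. It is an illustration only, not the OYSTER implementation: it assumes detections are already reduced to 2D bird's-eye-view centroids in a shared, ego-motion-compensated frame, and that a detection is kept when a nearby detection appears in enough other frames. The function names and the `radius` and `min_support` parameters are hypothetical.

```python
from math import hypot


def has_match(center, candidates, radius):
    """Return True if any candidate centroid lies within `radius` of `center`."""
    return any(
        hypot(center[0] - c[0], center[1] - c[1]) <= radius for c in candidates
    )


def filter_by_temporal_consistency(frames, radius=1.0, min_support=2):
    """Keep detections that reappear near the same place in other frames.

    frames: list of per-frame lists of (x, y) centroids, all expressed in one
            common coordinate frame (an assumption of this sketch).
    radius: maximum distance in meters for two centroids to count as a match.
    min_support: number of *other* frames that must contain a match.
    """
    kept = []
    for t, detections in enumerate(frames):
        others = [f for i, f in enumerate(frames) if i != t]
        kept.append(
            [
                d
                for d in detections
                if sum(has_match(d, f, radius) for f in others) >= min_support
            ]
        )
    return kept


# Toy example: one object seen in all three frames, plus two spurious clusters.
frames = [
    [(10.0, 2.0), (30.0, -5.0)],
    [(10.2, 2.1)],
    [(10.1, 1.9), (-8.0, 4.0)],
]
kept = filter_by_temporal_consistency(frames)
```

In this toy input, the centroid near (10, 2) is supported by both other frames and survives, while the two isolated centroids are dropped. Matching by fixed position only suits static objects; handling moving objects would require associating detections along a motion track, which this sketch does not attempt.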