We tackle semi-supervised object detection based on motion cues. Recent results suggest that heuristic-based clustering methods, in conjunction with object trackers, can be used to pseudo-label instances of moving objects, and that these pseudo-labels can serve as supervisory signals to train 3D object detectors on Lidar data without manual supervision. We rethink this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner. We leverage recent advances in scene flow estimation to obtain point trajectories, from which we extract long-term, class-agnostic motion patterns. Revisiting correlation clustering in the context of message passing networks, we learn to group those motion patterns and thereby cluster points into object instances. By estimating the full extent of the objects, we obtain per-scan 3D bounding boxes that we use to supervise a Lidar object detection network. Our method not only outperforms prior heuristic-based approaches (57.5 AP, a +14 improvement over prior work); more importantly, we show that we can pseudo-label and train object detectors across datasets.
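To make the pseudo-labeling idea concrete, the following is a minimal sketch of motion-based grouping, not the learned method described above. It assumes point trajectories are already available (for example from a scene flow estimator) and replaces the learned message passing network and correlation clustering with a simple stand-in: hand-set thresholds plus union-find over pairwise links. All function names, parameters, and threshold values (`cluster_by_motion`, `min_travel`, `max_motion_diff`, `max_gap`, `boxes_at_frame`) are hypothetical and chosen for illustration only, and the boxes are axis-aligned boxes around the observed points rather than estimates of the full object extent.

```python
from itertools import combinations
from math import dist


def motion_difference(traj_a, traj_b):
    """Mean Euclidean gap between the per-step displacements of two trajectories."""
    steps = len(traj_a) - 1
    total = 0.0
    for t in range(steps):
        step_a = tuple(traj_a[t + 1][k] - traj_a[t][k] for k in range(3))
        step_b = tuple(traj_b[t + 1][k] - traj_b[t][k] for k in range(3))
        total += dist(step_a, step_b)
    return total / steps


def cluster_by_motion(trajectories, min_travel=0.5, max_motion_diff=0.1, max_gap=2.0):
    """Group moving points that start close together and move alike.

    Heuristic stand-in (assumption): fixed thresholds decide which pairs are
    linked; a learned model would predict these pairwise decisions instead.
    Returns one label per trajectory, with -1 for static points.
    """
    n = len(trajectories)
    # A point counts as moving if its net displacement is large enough.
    moving = [dist(tuple(t[0]), tuple(t[-1])) >= min_travel for t in trajectories]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if not (moving[i] and moving[j]):
            continue
        close = dist(tuple(trajectories[i][0]), tuple(trajectories[j][0])) <= max_gap
        similar = motion_difference(trajectories[i], trajectories[j]) <= max_motion_diff
        if close and similar:
            parent[find(i)] = find(j)

    ids = {}
    labels = []
    for i in range(n):
        if moving[i]:
            # Number clusters in order of first appearance.
            labels.append(ids.setdefault(find(i), len(ids)))
        else:
            labels.append(-1)
    return labels


def boxes_at_frame(trajectories, labels, frame):
    """Axis-aligned (min_corner, max_corner) box per cluster at one time step."""
    grouped = {}
    for traj, label in zip(trajectories, labels):
        if label >= 0:
            grouped.setdefault(label, []).append(tuple(traj[frame]))
    return {
        label: (tuple(map(min, zip(*pts))), tuple(map(max, zip(*pts))))
        for label, pts in grouped.items()
    }
```

In this sketch, two nearby points that both translate along the same axis end up in one cluster, a distant pair moving in another direction forms a second cluster, and a stationary point is left unlabeled. The resulting per-frame boxes play the role of pseudo-labels that a detector could then be trained on.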