We tackle semi-supervised object detection based on motion cues. Recent results suggest that heuristic-based clustering methods, in conjunction with object trackers, can pseudo-label instances of moving objects, and that these pseudo-labels can serve as supervisory signals to train 3D object detectors on Lidar data without manual supervision. We rethink this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner. We leverage recent advances in scene flow estimation to obtain point trajectories, from which we extract long-term, class-agnostic motion patterns. Revisiting correlation clustering in the context of message passing networks, we learn to group those motion patterns to cluster points into object instances. By estimating the full extent of the objects, we obtain per-scan 3D bounding boxes that we use to supervise a Lidar object detection network. Our method outperforms prior heuristic-based approaches (57.5 AP, a +14 improvement over prior work). More importantly, we show that we can pseudo-label and train object detectors across datasets.
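To make the data flow of such a pipeline concrete (point trajectories, then grouped instances, then per-scan boxes), here is a minimal sketch. It is not the method described above: a thresholded union-find grouping stands in for the learned message-passing clustering, and plain axis-aligned boxes stand in for full-extent estimation. All function names, parameters, and thresholds are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def motion_distance(traj_a, traj_b):
    """Mean distance between per-step displacement vectors of two trajectories."""
    steps = len(traj_a) - 1
    total = 0.0
    for t in range(steps):
        da = [traj_a[t + 1][k] - traj_a[t][k] for k in range(3)]
        db = [traj_b[t + 1][k] - traj_b[t][k] for k in range(3)]
        total += dist(da, db)
    return total / steps


def cluster_trajectories(trajs, motion_thresh=0.1, space_thresh=2.0):
    """Heuristic stand-in for learned grouping (assumed thresholds):
    link two points if they move alike and start close together."""
    parent = list(range(len(trajs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(trajs)), 2):
        similar_motion = motion_distance(trajs[i], trajs[j]) < motion_thresh
        nearby = dist(trajs[i][0], trajs[j][0]) < space_thresh
        if similar_motion and nearby:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(trajs)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def axis_aligned_box(points):
    """Return (min_corner, max_corner) enclosing a set of 3D points."""
    mins = tuple(min(p[k] for p in points) for k in range(3))
    maxs = tuple(max(p[k] for p in points) for k in range(3))
    return mins, maxs


def pseudo_label(trajs, frame=0):
    """Produce one box per cluster for the given frame (scan)."""
    return [
        axis_aligned_box([trajs[i][frame] for i in members])
        for members in cluster_trajectories(trajs)
    ]


# Toy input: two points moving together along x, two static points far away.
toy_trajs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(10.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(10.5, 0.0, 0.0), (10.5, 0.0, 0.0), (10.5, 0.0, 0.0)],
]
toy_boxes = pseudo_label(toy_trajs, frame=0)
```

In this toy input the two moving points form one group and the two static points another, so `pseudo_label` yields two boxes. In a data-driven variant, the pairwise linking rule inside `cluster_trajectories` is the part a learned model would replace.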