The perception of moving objects is crucial for autonomous robots performing collision avoidance in dynamic environments. LiDARs and cameras greatly enhance scene interpretation but do not provide direct motion information and face limitations under adverse weather. Radar sensors overcome these limitations and provide Doppler velocities, delivering direct information on dynamic objects. In this paper, we address the problem of moving instance segmentation in radar point clouds to enhance scene interpretation for safety-critical tasks. Our Radar Instance Transformer enriches the current radar scan with temporal information without passing aggregated scans through a neural network. We propose a full-resolution backbone to prevent information loss in sparse point cloud processing. Our instance transformer head incorporates essential information to enhance segmentation while also enabling reliable, class-agnostic instance assignments. In sum, our approach shows superior performance on the new moving instance segmentation benchmarks, which include diverse environments, and provides model-agnostic modules to enhance scene interpretation. The benchmark is based on the RadarScenes dataset and will be made available upon acceptance.