Moving object segmentation (MOS) offers a reliable way to detect traffic participants and is therefore of great interest in autonomous driving. Capturing dynamics is central to the MOS problem. Previous methods extract motion features directly from range images. In contrast, we argue that residual maps hold greater potential for motion information, while range images provide rich semantic guidance. Based on this intuition, we propose MF-MOS, a novel motion-focused model with a dual-branch structure for LiDAR moving object segmentation. We decouple the spatial-temporal information by capturing motion from residual maps and generating semantic features from range images, which serve as movable-object guidance for the motion branch. This straightforward yet distinctive design makes full use of both range images and residual maps, greatly improving performance on the LiDAR-based MOS task. At the time of submission, MF-MOS achieved a leading IoU of 76.7% on the MOS leaderboard of the SemanticKITTI dataset, demonstrating state-of-the-art performance. The implementation of MF-MOS has been released at https://github.com/SCNU-RISLAB/MF-MOS.
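The abstract does not define how a residual map is computed, so the following is only a minimal sketch under stated assumptions. It uses a formulation common in range-image MOS pipelines: the per-pixel absolute difference between the current range image and a previous one, normalized by the current range. The function name `residual_map`, the parameter `min_range`, and the convention that pixels without a LiDAR return hold a non-positive value are hypothetical and are not taken from the MF-MOS implementation. The previous range image is assumed to be already transformed into the current scan's coordinate frame and re-projected.

```python
import numpy as np


def residual_map(current_range, previous_range, min_range=1e-3):
    """Normalized absolute range difference between two aligned range images.

    Assumptions (not from the paper): both inputs are (H, W) arrays of ranges
    in metres, pixels without a LiDAR return hold a value <= 0, and
    previous_range is already expressed in the current scan's frame.
    """
    current_range = np.asarray(current_range, dtype=np.float32)
    previous_range = np.asarray(previous_range, dtype=np.float32)

    # A pixel is usable only if both scans have a return at that location.
    valid = (current_range > min_range) & (previous_range > min_range)

    # Invalid pixels stay at zero, i.e. they carry no motion cue.
    residual = np.zeros_like(current_range)
    residual[valid] = (
        np.abs(current_range[valid] - previous_range[valid]) / current_range[valid]
    )
    return residual


# Toy 2x2 example: one static pixel, one pixel whose range changed, and two
# pixels that lack a return in one of the scans.
current = np.array([[10.0, 20.0], [0.0, 5.0]])
previous = np.array([[10.0, 18.0], [3.0, 0.0]])
residual = residual_map(current, previous)
```

In this toy example the static pixel yields a residual of zero, the pixel whose range changed from 18 m to 20 m yields a positive residual, and pixels missing a return in either scan are left at zero. In a dual-branch design such as the one described above, maps of this kind would feed the motion branch, while the raw range image would feed the semantic branch.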