3D semantic segmentation on multi-scan large-scale point clouds plays an important role in autonomous systems. Unlike single-scan semantic segmentation, this task requires distinguishing the motion states of points in addition to their semantic categories. However, methods designed for single-scan segmentation perform poorly on the multi-scan task because they lack an effective way to integrate temporal information. We propose MarS3D, a plug-and-play motion-aware module for semantic segmentation on multi-scan 3D point clouds. This module can be flexibly combined with single-scan models to give them multi-scan perception abilities. The module has two key designs: the Cross-Frame Feature Embedding module for enriching representation learning and the Motion-Aware Feature Learning module for enhancing motion awareness. Extensive experiments show that MarS3D can improve the performance of the baseline model by a large margin. The code is available at https://github.com/CVMI-Lab/MarS3D.