4D LiDAR semantic segmentation, also referred to as multi-scan semantic segmentation, plays a crucial role in enhancing the environmental understanding capabilities of autonomous vehicles and robots. It classifies the semantic category of each LiDAR measurement point and detects whether the point is dynamic, a critical capability for tasks such as obstacle avoidance and autonomous navigation. Existing approaches often rely on computationally heavy 4D convolutions or recursive networks, which results in poor real-time performance and makes them unsuitable for online robotics and autonomous driving applications. In this paper, we introduce SegNet4D, a novel real-time 4D semantic segmentation network that offers both efficiency and strong semantic understanding. SegNet4D decomposes 4D segmentation into two tasks: single-scan semantic segmentation and moving object segmentation, each handled by a separate network head. The two results are then combined in a motion-semantic fusion module to achieve comprehensive 4D segmentation. Additionally, instance information is extracted from the current scan and exploited to enforce instance-wise segmentation consistency. Our approach surpasses the state of the art in both multi-scan semantic segmentation and moving object segmentation while offering greater efficiency, enabling real-time operation. Its effectiveness and efficiency have also been validated on a real-world unmanned ground platform. Our code will be released at https://github.com/nubot-nudt/SegNet4D.