LiDAR segmentation has become a crucial component in advanced autonomous driving systems. Recent range-view LiDAR segmentation approaches show promise for real-time processing. However, they inevitably suffer from corrupted contextual information and rely heavily on post-processing techniques for prediction refinement. In this work, we propose FRNet, a simple yet powerful method aimed at restoring the contextual information of range image pixels using the corresponding frustum LiDAR points. First, a frustum feature encoder module extracts per-point features within each frustum region, which preserves scene consistency and is crucial for point-level predictions. Next, a frustum-point fusion module updates per-point features hierarchically, enabling each point to gather more surrounding information via the frustum features. Finally, a head fusion module fuses features at different levels for the final semantic prediction. Extensive experiments on four popular LiDAR segmentation benchmarks under various task setups demonstrate the superiority of FRNet. Notably, FRNet achieves 73.3% and 82.5% mIoU on the test sets of SemanticKITTI and nuScenes, respectively. While achieving competitive performance, FRNet operates 5 times faster than state-of-the-art approaches. Such high efficiency opens up new possibilities for more scalable LiDAR segmentation. The code has been made publicly available at https://github.com/Xiangxu-0103/FRNet.
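The frustum regions above arise from the standard range-view projection: points that fall into the same range-image pixel form one frustum. The sketch below illustrates this assignment with a conventional spherical projection; the function name, resolution, and field-of-view values are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def frustum_assignment(points, H=32, W=512, fov_up=10.0, fov_down=-30.0):
    """Assign each LiDAR point (x, y, z) to a range-image pixel.

    Points sharing a pixel form one frustum region; a per-point encoder
    can then aggregate features within each frustum. Hypothetical
    parameters, chosen only for illustration.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)
    yaw = np.arctan2(y, x)                                   # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(depth, 1e-8), -1.0, 1.0))

    fov = np.deg2rad(fov_up - fov_down)                      # total vertical FOV
    u = 0.5 * (1.0 - yaw / np.pi) * W                        # column from azimuth
    v = (1.0 - (pitch - np.deg2rad(fov_down)) / fov) * H     # row from inclination

    u = np.clip(np.floor(u), 0, W - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int64)
    return v * W + u                                         # flat frustum id per point
```

Every point keeps its own features, but the shared frustum id lets the model exchange context between the 2D range image and the 3D points, which is the correspondence FRNet exploits.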