LiDAR segmentation has become a crucial component of advanced autonomous driving systems. Recent range-view LiDAR segmentation approaches show promise for real-time processing. However, they inevitably suffer from corrupted contextual information and rely heavily on post-processing techniques for prediction refinement. In this work, we propose FRNet, a simple yet powerful method that restores the contextual information of range image pixels using the corresponding frustum LiDAR points. First, a frustum feature encoder module extracts per-point features within each frustum region, which preserves scene consistency and is critical for point-level predictions. Next, a frustum-point fusion module updates per-point features hierarchically, enabling each point to gather more surrounding information through the frustum features. Finally, a head fusion module fuses features at different levels to produce the final semantic predictions. Extensive experiments on four popular LiDAR segmentation benchmarks under various task setups demonstrate the superiority of FRNet. Notably, FRNet achieves 73.3% and 82.5% mIoU on the test sets of SemanticKITTI and nuScenes, respectively. While achieving competitive performance, FRNet operates 5 times faster than state-of-the-art approaches. Such high efficiency opens up new possibilities for more scalable LiDAR segmentation. The code has been made publicly available at https://github.com/Xiangxu-0103/FRNet.
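To illustrate the frustum idea described above, the sketch below groups LiDAR points by the range-image pixel they project to, so that several points can share one frustum cell. This is a minimal, hypothetical illustration: the function name, sensor field-of-view bounds, and image resolution are assumptions for a typical 64-beam setup, not FRNet's actual implementation.

```python
import numpy as np

def group_points_by_frustum(points, H=64, W=512, fov_up=3.0, fov_down=-25.0):
    """Map each 3D point to a range-image cell and group point indices.

    Each (row, col) cell corresponds to one "frustum" that may contain
    multiple LiDAR points; per-point features could later be pooled per cell.
    H, W, fov_up, fov_down are illustrative sensor parameters (degrees).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)
    yaw = np.arctan2(y, x)                              # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(depth, 1e-8))      # elevation angle
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    # Horizontal angle -> column index, vertical angle -> row index.
    u = ((1.0 - (yaw / np.pi + 1.0) / 2.0) * W).astype(int) % W
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) * H
    v = np.clip(v, 0, H - 1).astype(int)
    frustums = {}
    for idx, cell in enumerate(zip(v, u)):
        frustums.setdefault(cell, []).append(idx)       # points per frustum
    return frustums

# Toy usage: two nearly collinear points fall into the same frustum cell,
# while a third point lands in a different one.
pts = np.array([[10.0, 0.0, 0.0], [10.0, -0.01, 0.0], [0.0, 10.0, 1.0]])
groups = group_points_by_frustum(pts)
```

Grouping points this way keeps every point assigned to a pixel-aligned region, which is what allows per-point features to be fused with 2D range-view features without discarding points that share a pixel.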