3D point clouds play a pivotal role in outdoor scene perception, especially in the context of autonomous driving. Recent advancements in 3D LiDAR segmentation often focus intensely on the spatial positioning and distribution of points for accurate segmentation. However, these methods, while robust in variable conditions, rely solely on coordinates and point intensity, which leads to poor isometric invariance and suboptimal segmentation. To tackle this challenge, our work introduces Range-Aware Pointwise Distance Distribution (RAPiD) features and the associated RAPiD-Seg architecture. Our RAPiD features exhibit rigid transformation invariance and effectively adapt to variations in point density, with a design focus on capturing the localized geometry of neighboring structures. They utilize inherent LiDAR isotropic radiation and semantic categorization for enhanced local representation and computational efficiency, while incorporating a 4D distance metric that integrates geometry and surface material reflectivity for improved semantic segmentation. To effectively embed high-dimensional RAPiD features, we propose a double-nested autoencoder structure with a novel class-aware embedding objective that encodes them into manageable voxel-wise embeddings. Additionally, we propose RAPiD-Seg, which incorporates channel-wise attention fusion, along with two effective RAPiD-Seg variants that further optimize the embedding for enhanced performance and generalization. Our method outperforms contemporary LiDAR segmentation work in terms of mIoU on the SemanticKITTI (76.1) and nuScenes (83.6) datasets.
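To make the idea of a rigid-transformation-invariant pointwise distance feature concrete, the following is a minimal sketch and not the RAPiD implementation. It assumes that each point is described by its sorted distances to its k nearest neighbors, and that the 4D metric is a plain Euclidean distance over (x, y, z, w · intensity). The function names, the weight `intensity_weight`, and the brute-force neighbor search are all hypothetical choices made for illustration; the range-aware neighbor selection and the class-aware embedding described in the abstract are not modeled here.

```python
import numpy as np


def knn_distance_features(xyz, intensity, k=4, intensity_weight=1.0):
    """Return, for each point, its sorted distances to its k nearest neighbours.

    Assumption (not from the paper): the 4D metric is the Euclidean distance
    over (x, y, z, intensity_weight * intensity).
    """
    xyz = np.asarray(xyz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Stack geometry and scaled reflectivity into one 4D descriptor per point.
    points4d = np.hstack([xyz, intensity_weight * intensity[:, None]])
    # Brute-force pairwise distances, O(N^2); fine for a small illustration.
    diff = points4d[:, None, :] - points4d[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Exclude each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    # Sort each row ascending and keep the k smallest distances.
    return np.sort(dist, axis=1)[:, :k]


def rotate_z_and_translate(xyz, theta, offset):
    """Apply a rigid transform: rotation about the z-axis, then translation."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return np.asarray(xyz, dtype=float) @ rotation.T + np.asarray(offset, dtype=float)


def rigid_invariance_check(num_points=50, k=4, seed=0):
    """Check that the features are unchanged by a rigid motion of the cloud."""
    rng = np.random.default_rng(seed)
    xyz = rng.normal(size=(num_points, 3))
    intensity = rng.uniform(size=num_points)
    before = knn_distance_features(xyz, intensity, k=k)
    moved = rotate_z_and_translate(xyz, theta=0.7, offset=[5.0, -2.0, 1.0])
    after = knn_distance_features(moved, intensity, k=k)
    return bool(np.allclose(before, after))
```

Because the feature depends only on inter-point distances and on intensity, which a rigid motion leaves untouched, `rigid_invariance_check()` should return `True`, whereas raw coordinates would change under the same transform.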