Place recognition is a fundamental task in robotic applications, allowing robots to perform loop closure detection within simultaneous localization and mapping (SLAM) and to relocalize on prior maps. Current range image-based networks use single-column convolution to keep features invariant to the image-column shifts caused by LiDAR viewpoint changes. However, this raises issues such as restricted receptive fields and excessive focus on local regions, degrading network performance. To address these issues, we propose a lightweight circular convolutional Transformer network, denoted CCTNet, which boosts performance by capturing structural information in point clouds and facilitating cross-dimensional interaction between spatial and channel information. First, a Circular Convolution Module (CCM) is introduced, expanding the network's receptive field while maintaining feature consistency across varying LiDAR viewpoints. Then, a Range Transformer Module (RTM) is proposed, which improves place recognition accuracy in scenes containing movable objects by combining channel and spatial attention mechanisms. Furthermore, we propose an overlap-based loss function that transforms place recognition from a binary loop-closure classification into a regression problem linked to the overlap between LiDAR frames. In extensive experiments on the KITTI and Ford Campus datasets, CCTNet surpasses comparable methods, achieving Recall@1 of 0.924 and 0.965 and Recall@1% of 0.990 and 0.993 on the test sets, showcasing superior performance. Results on a self-collected dataset further demonstrate the proposed method's potential for practical deployment in complex scenarios with movable objects, showing improved generalization across datasets.
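The key property behind the CCM is that a convolution with wrap-around (circular) padding along the range image's column axis is equivariant to column shifts, which correspond to yaw rotations of the LiDAR. As a minimal illustrative sketch (not the paper's implementation, which applies circular padding inside 2D convolution layers), a 1D circular convolution over one image row can be written as:

```python
def circular_conv1d(row, kernel):
    """Convolve one range-image row with wrap-around (circular) padding.

    Because the padding wraps, shifting the input columns shifts the
    output by the same amount, so descriptors built on top stay
    consistent under LiDAR yaw changes. Names here are illustrative.
    """
    n, k = len(row), len(kernel)
    half = k // 2
    out = []
    for i in range(n):
        # Index modulo n wraps around the image boundary.
        s = sum(kernel[j] * row[(i + j - half) % n] for j in range(k))
        out.append(s)
    return out


# A column shift of the input yields the same shift of the output:
row = [1.0, 2.0, 3.0, 4.0]
kern = [0.25, 0.5, 0.25]
shifted = row[1:] + row[:1]
out_a = circular_conv1d(row, kern)
out_b = circular_conv1d(shifted, kern)
assert out_b == out_a[1:] + out_a[:1]  # shift equivariance holds
```

In deep-learning frameworks the same effect is typically obtained with a circular padding mode on standard convolution layers, which also widens the effective receptive field compared with single-column convolution.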