Place recognition, the task of recognizing previously visited places, triggers back-end optimization in a full SLAM system. Many works built on deep learning tools such as MLPs, CNNs, and transformers have achieved great improvements in this field. The point cloud transformer is one of the strongest frameworks for place recognition in robotics, but its large memory consumption and expensive computation hinder the wide deployment of point cloud transformer networks on mobile or embedded devices. To solve this issue, we propose a binary point cloud transformer for place recognition. With it, a 32-bit full-precision model can be reduced to a 1-bit model that occupies less memory and uses faster binarized bitwise operations. To the best of our knowledge, this is the first binary point cloud transformer that can be deployed on mobile devices for online applications such as place recognition. Experiments on several standard benchmarks demonstrate that the proposed method achieves results comparable to the corresponding full-precision transformer model and even outperforms some full-precision deep learning methods. For example, in terms of average recall, the proposed method achieves 93.28% at top 1% and 85.74% at top 1 on the Oxford RobotCar dataset. Meanwhile, for the same transformer structure, the model size and the number of floating-point operations are reduced by 56.1% and 34.1%, respectively, when going from the original precision to binary precision.
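The memory and speed argument behind 1-bit models can be illustrated with a minimal sketch. It is not the implementation proposed here: it assumes the common sign-based binarization to {-1, +1} and a bit-packed dot product computed with XOR and a bit count, and the helper names `binarize`, `pack`, and `binary_dot` are hypothetical. It also does not attempt to reproduce the reported size or operation-count reductions.

```python
import numpy as np


def binarize(x):
    """Map real values to {-1, +1} with the sign function (0 maps to +1).

    Assumption: plain sign binarization, a common choice in binary networks.
    """
    return np.where(np.asarray(x) >= 0, 1, -1).astype(np.int8)


def pack(signs):
    """Pack a {-1, +1} vector into bytes, one bit per element (+1 -> 1, -1 -> 0)."""
    return np.packbits(np.asarray(signs) > 0)


def binary_dot(packed_a, packed_b, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    XOR marks the positions where the signs differ; counting those bits
    gives the number of mismatches, and dot = matches - mismatches.
    Padding bits are zero in both operands, so they never count as mismatches.
    """
    diff = np.bitwise_xor(packed_a, packed_b)
    mismatches = int(np.unpackbits(diff).sum())
    return n - 2 * mismatches


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 128
    x = rng.standard_normal(n).astype(np.float32)  # stand-in for activations
    w = rng.standard_normal(n).astype(np.float32)  # stand-in for weights

    xb, wb = binarize(x), binarize(w)

    # The bitwise result matches the ordinary integer dot product.
    reference = int(xb.astype(np.int32) @ wb.astype(np.int32))
    assert binary_dot(pack(xb), pack(wb), n) == reference

    # 128 float32 values take 512 bytes; the packed signs take 16 bytes.
    print(w.nbytes, pack(wb).nbytes)
```

On hardware with a native population-count instruction, the XOR and bit-count steps operate on whole machine words at once, which is where the speed advantage of binarized layers usually comes from. The NumPy version above only demonstrates the arithmetic.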