This paper proposes a pose-graph attentional graph neural network, called P-GAT, which compares (key)nodes between sequential and non-sequential sub-graphs for place recognition, as opposed to the frame-to-frame retrieval formulation commonly used in state-of-the-art (SOTA) place recognition methods. Building on the concept of pose-graph SLAM, P-GAT makes maximal use of the spatial and temporal information between neighbouring point cloud descriptors, which are generated by an existing encoder. Leveraging intra- and inter-attention and a graph neural network, P-GAT relates point clouds captured in nearby locations in Euclidean space and their embeddings in feature space. Experimental results on large-scale publicly available datasets demonstrate the effectiveness of our approach in scenes lacking distinct features and when training and testing environments have different distributions (domain adaptation). Further, an exhaustive comparison with the state-of-the-art shows performance gains. Code is available at https://github.com/csiro-robotics/P-GAT.
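The intra- and inter-attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with identity projections and a residual update, and it omits learned weights, multiple heads, and the pose-graph edges. The function names (`attend`, `similarity_matrix`) and the toy descriptors are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, context):
    """Scaled dot-product attention with a residual update.

    Assumption: identity query/key/value projections (a real model
    would learn these). Each query descriptor receives a weighted
    average of the context descriptors, added to itself.
    """
    dim = len(queries[0])
    updated = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in context]
        weights = softmax(scores)
        message = [sum(w * c[i] for w, c in zip(weights, context))
                   for i in range(dim)]
        updated.append([qi + mi for qi, mi in zip(q, message)])
    return updated


def cosine(x, y):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)


def similarity_matrix(subgraph_a, subgraph_b):
    """Compare every node of one sub-graph with every node of another.

    Step 1 (intra-attention): nodes attend within their own sub-graph.
    Step 2 (inter-attention): nodes attend across the two sub-graphs.
    Step 3: pairwise cosine similarity between the updated descriptors.
    """
    a_intra = attend(subgraph_a, subgraph_a)
    b_intra = attend(subgraph_b, subgraph_b)
    a_inter = attend(a_intra, b_intra)
    b_inter = attend(b_intra, a_intra)
    return [[cosine(x, y) for y in b_inter] for x in a_inter]


# Toy usage: three descriptors in one sub-graph, two in the other.
scores = similarity_matrix(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[1.0, 0.0], [0.5, 0.5]],
)
```

The result is a node-to-node score matrix with one row per node of the first sub-graph and one column per node of the second, which is the sub-graph-to-sub-graph comparison that replaces scoring single frames in isolation.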