Current point cloud registration methods rely mainly on local geometric information and usually ignore the semantic information contained in the scene. In this paper, we treat point cloud registration as a semantic instance matching and registration task, and propose a deep semantic graph matching method (DeepSGM) for large-scale outdoor point cloud registration. First, the semantic category labels of 3D points are obtained with a semantic segmentation network. Adjacent points with the same category label are then grouped by Euclidean clustering into semantic instances, each represented by three kinds of attributes: spatial location, semantic category, and global geometric shape. Second, a semantic adjacency graph is constructed from the spatial adjacency relations between semantic instances. To fully explore the topological structure among semantic instances within the same scene and across different scenes, the spatial distribution features and the semantic categorical features are learned with graph convolutional networks, and the global geometric shape features are learned with a PointNet-like network. These three kinds of features are further enhanced with self-attention and cross-attention mechanisms. Third, semantic instance matching is formulated as an optimal transport problem and solved by an optimal matching layer. Finally, the geometric transformation matrix between the two point clouds is first estimated by the SVD algorithm and then refined by the ICP algorithm. Experiments on the KITTI Odometry dataset demonstrate that the proposed method improves registration performance and outperforms various state-of-the-art methods.
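The last two stages of the pipeline, matching instances by optimal transport and estimating the transformation by SVD, can be illustrated with a minimal NumPy sketch. It is not the DeepSGM implementation. The learned instance features are replaced by a hypothetical hand-crafted descriptor (`distance_profile`), and the optimal matching layer is approximated by plain Sinkhorn iterations with uniform marginals. The sketch assumes every instance has a counterpart in the other cloud. The function names, the regularization weight `eps`, and the synthetic data are all assumptions made for illustration, and the ICP refinement step is omitted.

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    a_max = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(a_max, axis=axis) + np.log(np.sum(np.exp(a - a_max), axis=axis))


def sinkhorn(scores, eps=0.1, n_iters=50):
    """Entropy-regularized transport plan with uniform marginals (log domain)."""
    n, m = scores.shape
    log_k = scores / eps
    log_mu = np.full(n, -np.log(n))  # each source instance carries mass 1/n
    log_nu = np.full(m, -np.log(m))  # each target instance carries mass 1/m
    u, v = np.zeros(n), np.zeros(m)
    for _ in range(n_iters):
        u = log_mu - logsumexp(log_k + v[None, :], axis=1)
        v = log_nu - logsumexp(log_k + u[:, None], axis=0)
    return np.exp(log_k + u[:, None] + v[None, :])


def estimate_rigid_transform(src, dst):
    """Least-squares R, t with dst[i] ~ R @ src[i] + t, via SVD (Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def distance_profile(centroids):
    """Hypothetical stand-in descriptor: sorted distances to all centroids."""
    diff = centroids[:, None, :] - centroids[None, :, :]
    return np.sort(np.linalg.norm(diff, axis=2), axis=1)


# Synthetic instance centroids (assumed data, not from KITTI).
rng = np.random.default_rng(0)
src = rng.uniform(-20.0, 20.0, size=(8, 3))

angle = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([2.0, -1.0, 0.5])
perm = rng.permutation(len(src))
dst = (src @ R_true.T + t_true)[perm]  # dst[k] is the moved copy of src[perm[k]]

# Score matrix: higher means more similar descriptors.
f_src, f_dst = distance_profile(src), distance_profile(dst)
scores = -np.linalg.norm(f_src[:, None, :] - f_dst[None, :, :], axis=2)

plan = sinkhorn(scores)
match = plan.argmax(axis=1)  # match[i] is the dst index assigned to src[i]

# Coarse alignment from the matched instance centroids.
R_est, t_est = estimate_rigid_transform(src, dst[match])
```

The distance-profile descriptor is used here only because it is invariant to rigid motion and needs no training. In the method described above, the score matrix would instead come from the attention-enhanced spatial, semantic, and shape features. The resulting `R_est` and `t_est` play the role of the coarse estimate that the abstract says is subsequently refined by ICP.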