JSMNet: Improving Indoor Point Cloud Semantic and Instance Segmentation through Self-Attention and Multiscale

Semantic understanding of indoor 3D point cloud data underpins a range of downstream applications, including indoor service robots, navigation systems, and digital twin engineering. Global features are essential for high-quality semantic and instance segmentation of indoor point clouds because they supply long-range context. To this end, we propose JSMNet, which combines a multi-layer network with a global feature self-attention module to jointly segment the semantics and instances of 3D point clouds. To better represent the characteristics of indoor targets, we design a multi-resolution feature adaptive fusion module that accounts for the differences in point cloud density caused by varying distances between the scanner and the target. In addition, we propose a joint semantic and instance segmentation framework that integrates semantic and instance features so that each task benefits from the other. We conduct experiments on S3DIS, a large-scale 3D indoor point cloud dataset. Compared with existing methods, our method achieves better semantic and instance segmentation results and segments local regions of targets more accurately. Specifically, it outperforms PointNet (Qi et al., 2017a) by 16.0% in semantic segmentation mIoU on S3DIS (Area 5) and by 26.3% in instance segmentation mPre. It also surpasses ASIS (Wang et al., 2019) by 6.0% and 4.6% on these two metrics, respectively, and JSPNet (Chen et al., 2022) by 3.3% in semantic segmentation mIoU and by a slight 0.3% in instance segmentation mPre.
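For readers unfamiliar with the semantic segmentation metric quoted above, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from per-point labels. It is not the authors' evaluation code: the function name `mean_iou`, its parameters, and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions made for illustration.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over classes (hypothetical helper, not from the paper).

    pred, target: integer class labels per point, same length.
    Classes absent from both pred and target are skipped (an assumption;
    some benchmarks average over all classes regardless).
    """
    pred = np.asarray(pred, dtype=np.int64).ravel()
    target = np.asarray(target, dtype=np.int64).ravel()

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    # Union = predicted count + ground-truth count - intersection.
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())


if __name__ == "__main__":
    # Toy labels for six points and three classes.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. The instance-level metric mPre is not sketched here because it additionally requires matching predicted instances to ground-truth instances.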
