Contemporary point cloud segmentation approaches rely largely on richly annotated 3D training data. However, obtaining consistently accurate annotations for such 3D scene data is both time-consuming and challenging. Moreover, fully unsupervised scene segmentation for point clouds remains underexplored, especially for holistic 3D scenes. This paper presents U3DS$^3$ as a step towards completely unsupervised point cloud segmentation for holistic 3D scenes. U3DS$^3$ applies a generalized unsupervised segmentation method to both objects and background in indoor and outdoor static 3D point clouds. It requires no model pre-training and relies only on the inherent information of the point cloud to segment the full 3D scene. The first step of the proposed approach generates superpoints from the geometric characteristics of each scene. The model then learns through a spatial clustering-based methodology, followed by iterative training with pseudo-labels generated from the cluster centroids. In addition, by exploiting the invariance and equivariance of volumetric representations, we apply geometric transformations to voxelized features to obtain two sets of descriptors for robust representation learning. Finally, our evaluation shows state-of-the-art results on the ScanNet and SemanticKITTI benchmark datasets and competitive results on S3DIS.
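The pseudo-labelling step described in the abstract (cluster superpoint descriptors, then label every point by the centroid its superpoint falls under) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `superpoint_pseudo_labels`, its parameters, the mean pooling of per-point features, and the use of plain k-means in place of the paper's spatial clustering are all assumptions made for illustration, and the network training loop that would consume the pseudo-labels is omitted.

```python
import numpy as np


def superpoint_pseudo_labels(features, superpoint_ids, k, n_iters=10, seed=0):
    """Hypothetical helper: per-point pseudo-labels from k-means over superpoints.

    features:       (N, D) array of per-point features.
    superpoint_ids: (N,) integer array mapping each point to a superpoint.
    k:              number of clusters (pseudo-classes); must not exceed
                    the number of distinct superpoints.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    superpoint_ids = np.asarray(superpoint_ids).ravel()

    # Pool per-point features into one mean descriptor per superpoint
    # (mean pooling is an assumption of this sketch).
    sp_unique, sp_index = np.unique(superpoint_ids, return_inverse=True)
    sp_index = sp_index.reshape(-1)
    n_sp = sp_unique.size
    sums = np.zeros((n_sp, features.shape[1]))
    np.add.at(sums, sp_index, features)
    counts = np.bincount(sp_index, minlength=n_sp)[:, None]
    sp_feats = sums / counts

    # Plain k-means (Lloyd's algorithm) over the superpoint descriptors,
    # initialised from k distinct superpoints.
    centroids = sp_feats[rng.choice(n_sp, size=k, replace=False)]
    for _ in range(n_iters):
        dists = np.linalg.norm(sp_feats[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for c in range(k):
            members = sp_feats[assign == c]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[c] = members.mean(axis=0)

    # Final assignment against the updated centroids, broadcast back to points
    # so that all points of a superpoint share one pseudo-label.
    dists = np.linalg.norm(sp_feats[:, None, :] - centroids[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    return assign[sp_index], centroids


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats = rng.normal(size=(1000, 16))           # stand-in for learned features
    sp_ids = rng.integers(0, 50, size=1000)       # stand-in for geometric superpoints
    labels, cents = superpoint_pseudo_labels(feats, sp_ids, k=5)
    print(labels.shape, cents.shape)
```

In an iterative scheme of the kind the abstract outlines, such pseudo-labels would supervise one round of training, after which the features are recomputed and the clustering is repeated. Assigning labels at the superpoint level rather than per point keeps geometrically coherent regions consistent.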