Unsupervised domain adaptation (UDA) for 3D segmentation is challenging, largely because point cloud data are sparse and unordered. For LiDAR point clouds in particular, the domain discrepancy is pronounced across capture scenes, weather conditions, and LiDAR devices. Previous UDA methods have typically tried to narrow this gap by aligning features between the source and target domains, but this approach falls short in 3D segmentation because the domain variations are substantial. Inspired by the strong generalization that the vision foundation model SAM exhibits in image segmentation, our approach leverages the general knowledge embedded in SAM to unify feature representations across diverse 3D domains and thereby address the 3D domain adaptation problem. Specifically, we use the images associated with the point clouds to facilitate knowledge transfer and propose a hybrid feature augmentation method that significantly improves the alignment between the 3D feature space and SAM's feature space at both the scene and instance levels. We evaluate our method on multiple widely recognized datasets, where it achieves state-of-the-art performance.
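The image-mediated knowledge transfer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with known intrinsics, a per-pixel 2D feature map already extracted from the image (standing in for SAM features), and a cosine-distance alignment objective. The abstract specifies none of these. All function names are hypothetical, and the hybrid feature augmentation step is not shown.

```python
import math


def project_point(point_cam, fx, fy, cx, cy):
    """Pinhole projection of one camera-frame point (x, y, z) to pixel (u, v).

    Returns None for points at or behind the camera plane.
    """
    x, y, z = point_cam
    if z <= 1e-6:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def sample_feature(feat_map, uv):
    """Nearest-pixel lookup in feat_map[row][col]; None if uv is missing or outside."""
    if uv is None:
        return None
    col, row = round(uv[0]), round(uv[1])
    if 0 <= row < len(feat_map) and 0 <= col < len(feat_map[0]):
        return feat_map[row][col]
    return None


def cosine_distance(a, b, eps=1e-8):
    """1 - cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + eps)


def alignment_loss(points_cam, point_feats, feat_map, intrinsics):
    """Mean cosine distance between 3D point features and the 2D features
    at the pixels the points project to (assumed objective, for illustration).

    Points that fall outside the image or behind the camera are skipped.
    """
    losses = []
    for point, feat_3d in zip(points_cam, point_feats):
        feat_2d = sample_feature(feat_map, project_point(point, *intrinsics))
        if feat_2d is not None:
            losses.append(cosine_distance(feat_3d, feat_2d))
    return sum(losses) / len(losses) if losses else 0.0
```

In a training loop, a loss of this kind would be minimized over the 3D network's parameters for both source and target scans, so that features from both domains are pulled toward the same image-derived feature space. A practical version would be vectorized and differentiable.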