Deep-learning models for 3D point cloud semantic segmentation generalize poorly when they are trained and tested on data captured with different sensors or in different environments, a consequence of domain shift. Domain adaptation methods can mitigate this shift, for instance by simulating sensor noise, developing domain-agnostic generators, or training point cloud completion networks. These methods are often tailored to range-view maps or require multi-modal input. In contrast, domain adaptation in the image domain can be performed through sample mixing, which manipulates the input data rather than relying on separate adaptation modules. In this study, we introduce compositional semantic mixing for point cloud domain adaptation, the first unsupervised domain adaptation technique for point cloud segmentation based on semantic and geometric sample mixing. We present a two-branch symmetric network architecture that concurrently processes point clouds from a source domain (e.g., synthetic) and point clouds from a target domain (e.g., real-world). Each branch operates within one domain by integrating selected data fragments from the other domain, using semantic information derived from source labels and target (pseudo) labels. Additionally, our method can leverage a limited number of human point-level annotations (semi-supervised) to further improve performance. We evaluate our approach in both synthetic-to-real and real-to-real scenarios using LiDAR datasets and show that it significantly outperforms state-of-the-art methods in both unsupervised and semi-supervised settings.
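
To make the mixing idea concrete, the following is a minimal NumPy sketch of how a sample for each branch could be composed: points of selected classes are copied from one domain into a point cloud of the other, with source labels on one side and confidence-filtered pseudo-labels on the other. It is an illustration under stated assumptions, not the implementation described in the abstract: the function names (`pseudo_label`, `select_by_class`, `mix_into`), the confidence threshold of 0.9, the ignore index of -1, and the choice of classes are all hypothetical, and any augmentation of the copied fragments is omitted.

```python
import numpy as np


def pseudo_label(probs, threshold=0.9, ignore_index=-1):
    """Turn per-point class probabilities into pseudo-labels.

    Points whose top probability is below `threshold` get `ignore_index`.
    Both values are assumptions chosen for illustration.
    """
    labels = probs.argmax(axis=1)
    labels[probs.max(axis=1) < threshold] = ignore_index
    return labels


def select_by_class(points, labels, classes):
    """Return the points (and labels) that belong to the given classes."""
    mask = np.isin(labels, classes)
    return points[mask], labels[mask]


def mix_into(host_points, host_labels, donor_points, donor_labels, donor_classes):
    """Append the donor points of the selected classes to the host point cloud."""
    patch_points, patch_labels = select_by_class(donor_points, donor_labels, donor_classes)
    mixed_points = np.concatenate([host_points, patch_points], axis=0)
    mixed_labels = np.concatenate([host_labels, patch_labels], axis=0)
    return mixed_points, mixed_labels


rng = np.random.default_rng(0)

# Toy source cloud with ground-truth labels (e.g. synthetic data).
src_points = rng.normal(size=(6, 3))
src_labels = np.array([0, 0, 1, 1, 2, 2])

# Toy target cloud with predicted class probabilities (e.g. real-world data).
tgt_points = rng.normal(size=(5, 3))
tgt_probs = np.array([
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
    [0.10, 0.10, 0.80],
    [0.01, 0.01, 0.98],
])
tgt_labels = pseudo_label(tgt_probs)

# Sample for the target branch: target cloud plus source points of class 1.
t_mix_points, t_mix_labels = mix_into(tgt_points, tgt_labels, src_points, src_labels, [1])

# Sample for the source branch: source cloud plus confident target points of class 2.
s_mix_points, s_mix_labels = mix_into(src_points, src_labels, tgt_points, tgt_labels, [2])
```

Selecting target points by class from the pseudo-labels excludes low-confidence points automatically, because the ignore index never matches a requested class. A segmentation loss computed on the mixed labels would likewise need to skip points carrying the ignore index.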