Deep-learning models for 3D point cloud semantic segmentation generalize poorly when they are trained and tested on data captured with different sensors or in different environments, a consequence of domain shift. Domain adaptation methods can mitigate this shift, for instance by simulating sensor noise, developing domain-agnostic generators, or training point cloud completion networks. However, these methods are often tailored to range-view maps or require multi-modal input. In contrast, domain adaptation in the image domain can be performed through sample mixing, which manipulates the input data rather than relying on distinct adaptation modules. In this study, we introduce compositional semantic mixing for point cloud domain adaptation, the first unsupervised domain adaptation technique for point cloud segmentation based on semantic and geometric sample mixing. We present a two-branch symmetric network architecture that concurrently processes point clouds from a source domain (e.g., synthetic) and point clouds from a target domain (e.g., real-world). Each branch operates within one domain by integrating selected data fragments from the other domain, using semantic information derived from source labels and target (pseudo) labels. Additionally, our method can leverage a limited number of human point-level annotations (semi-supervised) to further improve performance. We assess our approach in both synthetic-to-real and real-to-real scenarios using LiDAR datasets and demonstrate that it significantly outperforms state-of-the-art methods in both unsupervised and semi-supervised settings.
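The mixing idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `semantic_mix`, the way classes are chosen, and the random rotation about the vertical axis are all assumptions made for illustration. The sketch selects points of given semantic classes from one domain and pastes them into a point cloud from the other domain, carrying their labels (or pseudo-labels) along.

```python
import numpy as np

def semantic_mix(donor_pts, donor_lbl, host_pts, host_lbl, classes, rng=None):
    """Paste donor points of the chosen classes into the host cloud.

    Hypothetical helper: the selection rule and augmentation are
    illustrative assumptions, not the method's actual procedure.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Semantic selection: keep only donor points whose label is in `classes`.
    mask = np.isin(donor_lbl, classes)
    patch = donor_pts[mask]

    # Simple geometric augmentation (assumed): rotate the patch about the z axis.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    patch = patch @ rot.T

    # Mixing: concatenate host points with the transformed donor patch.
    mixed_pts = np.concatenate([host_pts, patch], axis=0)
    mixed_lbl = np.concatenate([host_lbl, donor_lbl[mask]], axis=0)
    return mixed_pts, mixed_lbl

# Toy data standing in for a labeled source scan and a pseudo-labeled target scan.
rng = np.random.default_rng(0)
src_pts, src_lbl = rng.normal(size=(100, 3)), rng.integers(0, 4, size=100)
tgt_pts, tgt_pseudo = rng.normal(size=(80, 3)), rng.integers(0, 4, size=80)

# Branch operating in the target domain: source fragments pasted into the target scan.
mixed_t_pts, mixed_t_lbl = semantic_mix(src_pts, src_lbl, tgt_pts, tgt_pseudo, [1, 2], rng)
# Branch operating in the source domain: target fragments (pseudo-labels) pasted into the source scan.
mixed_s_pts, mixed_s_lbl = semantic_mix(tgt_pts, tgt_pseudo, src_pts, src_lbl, [0], rng)
```

In this sketch the two calls mirror the symmetric two-branch setup: each mixed cloud stays mostly in one domain while containing labeled fragments from the other, and each could then be fed to a segmentation network for training.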