LiDAR-based 3D detection has made great progress in recent years. However, the performance of 3D detectors degrades considerably when they are deployed in unseen environments, owing to the severe domain gap. Existing domain adaptive 3D detection methods do not adequately address the distributional discrepancy in feature space, which hinders the generalization of detectors across domains. In this work, we propose a novel unsupervised domain adaptive \textbf{3D} detection framework, namely \textbf{G}eometry-aware \textbf{P}rototype \textbf{A}lignment (\textbf{GPA-3D}), which explicitly leverages the intrinsic geometric relationships of point cloud objects to reduce the feature discrepancy, thus facilitating cross-domain transfer. Specifically, GPA-3D assigns a series of tailored, learnable prototypes to point cloud objects with distinct geometric structures. Each prototype aligns the bird's-eye-view (BEV) features derived from the corresponding point cloud objects in the source and target domains, reducing the distributional discrepancy and achieving better adaptation. Evaluation results on various benchmarks, including Waymo, nuScenes and KITTI, demonstrate the superiority of GPA-3D over state-of-the-art approaches in different adaptation scenarios. The MindSpore version of the code will be made publicly available at \url{https://github.com/Liz66666/GPA3D}.
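To make the idea of prototype-based alignment concrete, the following is a minimal sketch and not the GPA-3D implementation. It assumes that each object has already been assigned to a geometric group, that each group owns one prototype vector, and that alignment is measured as one minus the cosine similarity between an object's pooled BEV feature and its group's prototype. The function names, the grouping, the feature dimension and the loss form are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def prototype_alignment_loss(features, group_ids, prototypes):
    """Mean (1 - cosine similarity) between each feature and its prototype.

    features:   (N, D) pooled BEV features, one row per object (assumed input)
    group_ids:  (N,)   index of the geometric group of each object (assumed)
    prototypes: (K, D) one prototype per geometric group (hypothetical)
    """
    f = l2_normalize(features)
    p = l2_normalize(prototypes)[group_ids]  # pick each object's prototype
    return float(np.mean(1.0 - np.sum(f * p, axis=-1)))


# Toy data: 3 hypothetical geometric groups, 16-dimensional features.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 16))
src_feats, src_groups = rng.normal(size=(8, 16)), rng.integers(0, 3, size=8)
tgt_feats, tgt_groups = rng.normal(size=(5, 16)), rng.integers(0, 3, size=5)

# Both domains are pulled toward the same shared prototypes, which is what
# reduces the discrepancy between source and target features in this sketch.
total_loss = (
    prototype_alignment_loss(src_feats, src_groups, prototypes)
    + prototype_alignment_loss(tgt_feats, tgt_groups, prototypes)
)
```

In a training setting the prototypes would be learnable parameters updated by gradient descent together with the detector. Plain NumPy arrays are used here only to keep the sketch self-contained.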