LiDAR-based 3D detectors require large training datasets, yet they still struggle to generalize to novel domains. Domain Generalization (DG) aims to mitigate this by training detectors that are invariant to such domain shifts. Current DG approaches rely exclusively on global geometric features (the Cartesian coordinates of the point cloud) as input. Over-reliance on these global geometric features can, however, cause 3D detectors to prioritize object location and absolute position, resulting in poor cross-domain performance. To mitigate this, we propose to exploit explicit local point cloud structure for DG, in particular by encoding point cloud neighborhoods as Gaussian blobs (GBlobs). Our formulation is highly efficient and requires no additional parameters. Without any bells and whistles, simply by integrating GBlobs into existing detectors, we beat the current state of the art on challenging single-source DG benchmarks by over 21 mAP (Waymo→KITTI), 13 mAP (KITTI→Waymo), and 12 mAP (nuScenes→KITTI), without sacrificing in-domain performance. Additionally, GBlobs perform exceptionally well in multi-source DG, surpassing the current state of the art by 17, 12, and 5 mAP on Waymo, KITTI, and ONCE, respectively.
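To make the idea of encoding a point cloud neighborhood as a Gaussian blob concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that a blob is summarized by the mean and covariance of each point's k nearest neighbors, and that the mean is expressed relative to the query point. The function name `gaussian_blob_features`, the choice of k-nearest neighborhoods, and the 9-dimensional feature layout are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def gaussian_blob_features(points, k=8):
    """Summarize each point's local neighborhood as a Gaussian (mean, covariance).

    Hypothetical sketch: the neighborhood definition (k nearest neighbors) and
    the feature layout are assumptions, not taken from the paper.

    points: (N, 3) array of Cartesian coordinates.
    Returns an (N, 9) array: the neighborhood mean relative to the point (3)
    followed by the upper triangle of the neighborhood covariance (6).
    """
    points = np.asarray(points, dtype=np.float64)
    n = points.shape[0]
    k = min(k, n)

    # Brute-force pairwise squared distances; O(N^2) memory, fine for a sketch.
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)

    # Indices of the k closest points (the point itself is among them).
    idx = np.argpartition(dist2, k - 1, axis=1)[:, :k]
    neigh = points[idx]                                  # (N, k, 3)

    # Gaussian parameters of each neighborhood: no learnable weights involved.
    mean = neigh.mean(axis=1)                            # (N, 3)
    centered = neigh - mean[:, None, :]
    cov = np.einsum("nki,nkj->nij", centered, centered) / k  # (N, 3, 3)

    # The covariance is symmetric, so keep only its 6 unique entries.
    rows, cols = np.triu_indices(3)
    cov_feat = cov[:, rows, cols]                        # (N, 6)

    # Express the mean relative to the query point so that the feature
    # does not depend on where the neighborhood sits in the scene.
    return np.concatenate([mean - points, cov_feat], axis=1)
```

Under these assumptions the features involve no learnable parameters and do not change when the whole scene is translated, which illustrates how a local encoding can avoid depending on absolute position. A real detector would likely replace the brute-force distance matrix with a voxel grid or a spatial index.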