An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains

3D object detectors (3D-ODs) are crucial for understanding the environment in many robotic tasks, especially autonomous driving. Including 3D information from Lidar sensors greatly improves accuracy. However, such detectors perform poorly on domains they were not trained on, such as different locations, sensors, or weather conditions, which limits their reliability in safety-critical applications. Methods exist to adapt 3D-ODs to these domains, but they treat 3D-ODs as a black box and neglect the underlying architectural decisions and source-domain training strategies. Instead, we examine the details of 3D-ODs, focusing on fundamental factors that influence robustness prior to domain adaptation. We systematically investigate four design choices, and the interplay between them, that are often overlooked in 3D-OD robustness and domain adaptation: architecture, voxel encoding, data augmentations, and anchor strategies. We assess their impact on the robustness of nine state-of-the-art 3D-ODs across six benchmarks encompassing three types of domain gap: sensor type, weather, and location. Our main findings are: (1) transformer backbones with local point features are more robust than 3D CNNs; (2) test-time anchor size adjustment is crucial for adaptation across geographical locations, significantly boosting scores without retraining; (3) source-domain augmentations allow the model to generalize to low-resolution sensors; and (4) surprisingly, robustness to bad weather improves more when training on additional clean-weather data than when training on bad-weather data. We outline our main conclusions and findings to provide practical guidance on developing more robust 3D-ODs.
