LiDAR-based 3D perception algorithms have evolved rapidly alongside the emergence of large datasets. Nonetheless, considerable performance degradation often ensues when models trained on a specific dataset are applied to other datasets or to real-world scenarios with different LiDAR sensors. This paper aims to develop a unified model capable of handling different LiDARs, enabling continual learning across diverse LiDAR datasets and seamless deployment across heterogeneous platforms. We observe that the gaps among datasets primarily manifest as geometric disparities (such as variations in beam counts and point densities) and semantic inconsistencies (taxonomy conflicts). To this end, this paper proposes UniLiDAR, an occupancy prediction pipeline that leverages geometric realignment and semantic label mapping to facilitate training on multiple datasets and to mitigate performance degradation during deployment on heterogeneous platforms. Moreover, our method can be easily combined with existing 3D perception models. The efficacy of the proposed approach in bridging LiDAR domain gaps is verified by comprehensive experiments on two prominent datasets: OpenOccupancy-nuScenes and SemanticKITTI. UniLiDAR improves the mIoU of occupancy prediction by 15.7% and 12.5%, respectively, compared to a model trained on the directly merged dataset. Moreover, it outperforms several SOTA methods trained on individual datasets. We expect our research to facilitate further study of 3D generalization; the code will be available soon.
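To make the two ingredients named above concrete, the sketch below illustrates, under stated assumptions, what semantic label mapping and a crude form of geometric realignment can look like. The unified class list, the per-dataset label-id maps, and the point-budget subsampling are all illustrative placeholders, not the paper's actual taxonomy or realignment module.

```python
import numpy as np

# Hypothetical unified taxonomy (illustrative only; not the paper's mapping).
UNIFIED_CLASSES = ["vehicle", "pedestrian", "vegetation", "ground", "other"]

# Toy subsets of dataset-specific label ids mapped onto the unified ids.
NUSCENES_TO_UNIFIED = {17: 0, 7: 1, 24: 2, 30: 3}   # placeholder ids
KITTI_TO_UNIFIED = {10: 0, 30: 1, 70: 2, 40: 3}     # placeholder ids


def remap_labels(labels: np.ndarray, mapping: dict) -> np.ndarray:
    """Map dataset-specific semantic ids to the unified taxonomy;
    ids with no entry fall back to the 'other' class."""
    other = len(UNIFIED_CLASSES) - 1
    out = np.full_like(labels, other)
    for src, dst in mapping.items():
        out[labels == src] = dst
    return out


def subsample_points(points: np.ndarray, target_n: int,
                     seed: int = 0) -> np.ndarray:
    """Randomly subsample a point cloud to a common point budget --
    a crude stand-in for realigning point counts across LiDARs."""
    if len(points) <= target_n:
        return points
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=target_n, replace=False)
    return points[idx]
```

With both datasets remapped onto one label space and resampled to a shared point budget, their frames can be mixed in a single training set rather than merged with conflicting taxonomies and densities.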