Thanks to deep learning, 3D semantic segmentation for autonomous driving has become a well-studied subject, with methods that reach very high performance. Nonetheless, because training datasets are limited in size, these models cannot see every type of object and scene encountered in real-world applications. The ability to remain reliable in such unseen environments is called domain generalization. Despite its importance, domain generalization remains relatively unexplored for 3D semantic segmentation in autonomous driving. To fill this gap, this paper presents the first benchmark for this application, testing state-of-the-art methods and discussing the difficulty of tackling Laser Imaging Detection and Ranging (LiDAR) domain shifts. We also propose the first method designed to address this domain generalization problem, which we call 3DLabelProp. It leverages the geometry and sequentiality of LiDAR data to improve generalization performance by working on partially accumulated point clouds. Trained only on SemanticKITTI, it reaches a mean Intersection over Union (mIoU) of 50.4% on SemanticPOSS and 55.2% on PandaSet solid-state LiDAR, making it the state-of-the-art method for generalization (+5% and +33%, respectively, over the second-best method). The code for this method will be available on GitHub.
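For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a mean Intersection over Union can be computed from per-point class labels. It is not the evaluation code of 3DLabelProp or of any benchmark: the function name `mean_iou`, its parameters, and the choice to skip classes absent from both predictions and ground truth are assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_label: Optional[int] = None) -> float:
    """Mean Intersection over Union over per-point class labels.

    Hypothetical helper: predictions are assumed to lie in [0, num_classes).
    Classes that appear in neither the predictions nor the ground truth
    are skipped (an assumption; benchmarks differ on this convention).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # points where pred == target == c
    pred_count = [0] * num_classes    # points predicted as class c
    target_count = [0] * num_classes  # points labelled c in the ground truth

    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabelled points do not count toward any class
        target_count[t] += 1
        pred_count[p] += 1
        if p == t:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        # |A or B| = |A| + |B| - |A and B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example with three classes and five points.
    print(mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3))
```

In a cross-dataset setting such as the one described above, the label sets of the training and evaluation datasets would first need to be mapped to a shared set of classes; that mapping is not shown here.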