Densely annotating LiDAR point clouds is costly, which limits the scalability of fully-supervised learning methods. In this work, we study the underexplored problem of semi-supervised learning (SSL) for LiDAR segmentation. Our core idea is to leverage the strong spatial cues of LiDAR point clouds to better exploit unlabeled data. We propose LaserMix, which mixes laser beams from different LiDAR scans and then encourages the model to make consistent and confident predictions before and after mixing. Our framework has three appealing properties: 1) Generic: LaserMix is agnostic to LiDAR representations (e.g., range view and voxel), and hence our SSL framework can be universally applied. 2) Statistically grounded: We provide a detailed analysis to theoretically explain the applicability of the proposed framework. 3) Effective: Comprehensive experiments on popular LiDAR segmentation datasets (nuScenes, SemanticKITTI, and ScribbleKITTI) demonstrate the effectiveness and superiority of our approach. Notably, we achieve results competitive with fully-supervised counterparts using 2x to 5x fewer labels and improve the supervised-only baseline by 10.8% on average. We hope this concise yet high-performing framework can facilitate future research in semi-supervised LiDAR segmentation. Code is publicly available.
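The beam-mixing idea can be illustrated with a minimal NumPy sketch. It is not the released implementation: the abstract only states that laser beams from different scans are mixed, so the partition by inclination angle, the alternating assignment of areas, the function names `inclination` and `laser_mix`, and the default angle range are all assumptions made for illustration.

```python
import numpy as np


def inclination(points):
    """Inclination angle (radians) of each point; columns 0..2 are x, y, z."""
    xy_range = np.linalg.norm(points[:, :2], axis=1)
    return np.arctan2(points[:, 2], xy_range)


def laser_mix(points_a, labels_a, points_b, labels_b, num_areas=4,
              phi_min=np.deg2rad(-25.0), phi_max=np.deg2rad(3.0)):
    """Mix two scans by interleaving inclination areas (hypothetical helper).

    The range [phi_min, phi_max] is split into `num_areas` equal areas.
    Even-indexed areas are taken from scan A and odd-indexed areas from
    scan B. The default angles are illustrative values, not from the source.
    """
    edges = np.linspace(phi_min, phi_max, num_areas + 1)
    interior = edges[1:-1]  # points outside the range fall into the end areas

    area_a = np.digitize(inclination(points_a), interior)
    area_b = np.digitize(inclination(points_b), interior)

    keep_a = area_a % 2 == 0
    keep_b = area_b % 2 == 1

    mixed_points = np.concatenate([points_a[keep_a], points_b[keep_b]])
    mixed_labels = np.concatenate([labels_a[keep_a], labels_b[keep_b]])
    return mixed_points, mixed_labels
```

In a semi-supervised setting, `labels_b` could plausibly be pseudo-labels predicted for an unlabeled scan, and the model's prediction on the mixed scan would be encouraged to agree with `mixed_labels`. That training loop is outside this sketch.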