This paper explores scene affinity (AIScene), namely intra-scene consistency and inter-scene correlation, for semi-supervised LiDAR semantic segmentation in driving scenes. Adopting teacher-student training, AIScene employs a teacher network to generate pseudo-labeled scenes from unlabeled data, which then supervise the student network's learning. Unlike most methods, which include all points of a pseudo-labeled scene in forward propagation but use only pseudo-labeled points in backpropagation, AIScene removes points without pseudo-labels, ensuring consistency between forward and backward propagation within the scene. This simple point erasure strategy effectively prevents unsupervised, semantically ambiguous points (excluded from backpropagation) from interfering with the learning of pseudo-labeled points. Moreover, AIScene incorporates patch-based data augmentation, mixing multiple scenes at both the scene and instance levels. Compared with existing augmentation techniques, which typically perform scene-level mixing between only two scenes, our method enhances the semantic diversity of labeled (or pseudo-labeled) scenes, thereby improving the semi-supervised performance of segmentation models. Experiments show that AIScene outperforms previous methods on two popular benchmarks across four settings, achieving notable improvements of 1.9% and 2.1% under the most challenging setting with 1% labeled data.
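The point erasure strategy described above can be sketched as a simple filtering step. The following is a minimal illustrative example (not the paper's implementation); the function name, array layout, and the use of `-1` as an ignore index are assumptions for illustration:

```python
import numpy as np

def erase_unlabeled_points(points, pseudo_labels, ignore_index=-1):
    """Keep only points that received a pseudo-label from the teacher,
    so the same point set participates in both the forward and backward
    passes of the student network.

    points: (N, 3+C) array of LiDAR points (xyz plus features)
    pseudo_labels: (N,) array; ignore_index marks points the teacher
        left unlabeled (e.g., predictions below a confidence threshold).
    """
    keep = pseudo_labels != ignore_index  # boolean mask of labeled points
    return points[keep], pseudo_labels[keep]

# Toy scene: 5 points, two of which have no pseudo-label.
scene = np.random.rand(5, 4).astype(np.float32)
labels = np.array([2, -1, 0, -1, 2])

kept_points, kept_labels = erase_unlabeled_points(scene, labels)
print(kept_points.shape[0])  # 3 points survive the erasure
```

Because the unlabeled points are removed before the forward pass rather than merely masked out of the loss, they cannot influence the features of pseudo-labeled points through the network's receptive field.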