Automatic pseudo-labeling is a powerful tool for exploiting large amounts of sequential unlabeled data. It is especially appealing in safety-critical applications such as autonomous driving, where performance requirements are extreme, datasets are large, and manual labeling is very challenging. We propose to leverage sequences of point clouds to boost pseudo-labeling in a teacher-student setup by training multiple teachers, each with access to different temporal information. This set of teachers, dubbed Concordance, provides higher-quality pseudo-labels for student training than standard methods. The outputs of the multiple teachers are combined via a novel pseudo-label confidence-guided criterion. Our experimental evaluation focuses on the 3D point cloud domain and urban driving scenarios. We show the performance of our method applied to 3D semantic segmentation and 3D object detection on three benchmark datasets. Our approach, which uses only 20% of the manual labels, outperforms some fully supervised methods. A notable performance boost is achieved for classes that rarely appear in the training data.
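As a rough illustration of what a confidence-guided combination of several teachers could look like, the sketch below keeps, for each point, the label proposed by the most confident teacher and leaves the point unlabeled when even that confidence is low. This is an assumption made for illustration only, not the criterion defined in the paper: the function name `fuse_teacher_predictions`, the `min_confidence` threshold, and the `IGNORE_LABEL` marker are all hypothetical.

```python
import numpy as np

IGNORE_LABEL = -1  # hypothetical marker for points that receive no pseudo-label


def fuse_teacher_predictions(probs, min_confidence=0.9):
    """Fuse per-point class scores from several teachers (illustrative only).

    probs: array of shape (T, N, C) holding, for each of T teachers,
           softmax scores over C classes for N points.
    Returns (labels, confidence), each of shape (N,).
    """
    probs = np.asarray(probs, dtype=float)

    # Each teacher's top score and the class it belongs to, per point.
    teacher_conf = probs.max(axis=2)       # shape (T, N)
    teacher_label = probs.argmax(axis=2)   # shape (T, N)

    # For every point, pick the teacher whose top score is highest.
    best_teacher = teacher_conf.argmax(axis=0)  # shape (N,)
    point_idx = np.arange(probs.shape[1])
    labels = teacher_label[best_teacher, point_idx]
    confidence = teacher_conf[best_teacher, point_idx]

    # Assumed filtering step: drop pseudo-labels below the threshold.
    labels = np.where(confidence >= min_confidence, labels, IGNORE_LABEL)
    return labels, confidence
```

In a setup like this, the student would be trained only on points whose fused label is not `IGNORE_LABEL`; the threshold trades pseudo-label coverage against pseudo-label noise.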