Large-scale datasets with point-wise semantic and instance labels are crucial to 3D instance segmentation but expensive to annotate. To leverage unlabeled data, previous semi-supervised 3D instance segmentation approaches have explored self-training frameworks, which rely on high-quality pseudo labels for consistency regularization. These approaches use both instance and semantic pseudo labels in a joint learning manner. However, semantic pseudo labels contain considerable noise arising from the imbalanced category distribution and the natural confusion between similar but distinct categories, which leads to severe collapse in self-training. Motivated by the observation that 3D instances are non-overlapping and spatially separable, we ask whether we can rely solely on instance consistency regularization for improved semi-supervised segmentation. To this end, we propose a novel self-training network, InsTeacher3D, to explore and exploit pure instance knowledge from unlabeled data. We first build a parallel base 3D instance segmentation model, DKNet, which distinguishes each instance from the others via discriminative instance kernels without relying on semantic segmentation. Based on DKNet, we further design a novel instance consistency regularization framework to generate and leverage high-quality instance pseudo labels. Experimental results on multiple large-scale datasets show that InsTeacher3D significantly outperforms prior state-of-the-art semi-supervised approaches. Code is available at https://github.com/W1zheng/InsTeacher3D.