Traditional semi-supervised learning assumes that labeled and unlabeled data follow the same class distribution, but realistic open-world scenarios are more complex: unknown novel classes are mixed into the unlabeled set. The challenge is therefore twofold: to recognize samples from known classes, and to discover an unknown number of novel classes within the unlabeled data. In this paper, we introduce OpenNCD, a new open-world semi-supervised novel class discovery approach based on progressive bi-level contrastive learning over multiple prototypes. The proposed method consists of two mutually reinforcing parts. First, a bi-level contrastive learning method maintains pairwise similarity at both the prototype level and the prototype-group level for better representation learning. Second, a reliable prototype similarity metric is proposed based on the instances that prototypes represent in common. Prototypes with high similarity are grouped progressively for known class recognition and novel class discovery. Extensive experiments on three image datasets show that the proposed method is effective in open-world scenarios, especially when known classes and labels are scarce.
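To make the second part more concrete, the following is a minimal sketch of how a similarity between prototypes could be computed from the instances they represent in common, and how similar prototypes could then be merged into groups. The abstract does not give the exact metric or grouping rule, so everything specific here is an assumption for illustration: an instance is taken to be "represented" by its `k` most cosine-similar prototypes, similarity is the Jaccard overlap of the represented instance sets, and grouping is a greedy threshold-based merge. The names `prototype_similarity`, `group_prototypes`, `k`, and `threshold` are hypothetical and do not come from the paper.

```python
import numpy as np


def prototype_similarity(features, prototypes, k=2):
    """Jaccard overlap of the instance sets represented by each prototype.

    Assumption (not from the paper): an instance is 'represented' by its
    k most cosine-similar prototypes.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cos = f @ p.T                                # (n_instances, n_prototypes)
    topk = np.argsort(-cos, axis=1)[:, :k]       # k nearest prototypes per instance
    member = np.zeros(cos.shape, dtype=bool)
    member[np.arange(len(f))[:, None], topk] = True
    m = member.astype(float)
    inter = m.T @ m                              # instances shared by each prototype pair
    counts = m.sum(axis=0)
    union = counts[:, None] + counts[None, :] - inter
    return np.where(union > 0, inter / np.maximum(union, 1), 0.0)


def group_prototypes(similarity, threshold):
    """Greedily merge prototype pairs, most similar first (union-find).

    Assumption: merging stops once the pair similarity drops below `threshold`.
    Returns one group id per prototype, numbered in order of first appearance.
    """
    n = similarity.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path compression
            i = parent[i]
        return i

    pairs = [(similarity[i, j], i, j) for i in range(n) for j in range(i + 1, n)]
    for s, i, j in sorted(pairs, reverse=True):
        if s < threshold:
            break
        parent[find(i)] = find(j)

    roots = [find(i) for i in range(n)]
    remap = {r: g for g, r in enumerate(dict.fromkeys(roots))}
    return [remap[r] for r in roots]
```

In this sketch the number of resulting groups is not fixed in advance; it follows from the threshold, which loosely mirrors the idea of discovering an unknown number of classes. A progressive variant would recompute the similarities and repeat the merge as the representation improves during training.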