Open-set semi-supervised learning (OSSL) is a realistic semi-supervised learning setting in which the unlabeled training set contains classes that are not present in the labeled set. Many existing OSSL methods assume that these out-of-distribution data are harmful and put effort into excluding data from unknown classes from the training objective. In contrast, we propose an OSSL framework that facilitates learning from all unlabeled data through self-supervision. Additionally, we utilize an energy-based score to accurately recognize data belonging to the known classes, making our method well-suited for handling uncurated data in deployment. Extensive experimental evaluations on several datasets show that our method achieves unmatched overall robustness and performance, in terms of both closed-set accuracy and open-set recognition, compared with state-of-the-art OSSL methods. Our code will be released upon publication.
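To make the idea of an energy-based score concrete, the following is a minimal sketch, not the method's actual implementation. The abstract does not state the exact form of the score, so this assumes the common free-energy form computed from classifier logits, E(x) = -T log Σ_k exp(f_k(x)/T). The function names, the temperature parameter, and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free energy of a logit vector: -T * log(sum(exp(logit / T))).

    Assumed form of the score; the abstract does not specify it.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp


def is_known(logits, threshold, temperature=1.0):
    """Treat a sample as belonging to a known class when its energy is low.

    The threshold is a hypothetical hyperparameter, e.g. tuned on held-out data.
    """
    return energy_score(logits, temperature) <= threshold


# Illustrative logits: one confident prediction, one nearly flat prediction.
confident = [9.0, 0.5, -1.0]
flat = [0.1, 0.0, -0.1]

print(energy_score(confident) < energy_score(flat))
print(is_known(confident, threshold=-5.0), is_known(flat, threshold=-5.0))
```

Under this assumed form, a sample with one dominant logit receives a lower energy than a sample with nearly uniform logits, so thresholding the energy separates confident known-class predictions from the rest.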