Open-set semi-supervised learning (OSSL) is a practical semi-supervised learning scenario in which the unlabeled training set contains classes that are absent from the labeled set. Many existing OSSL methods assume that these out-of-distribution data are harmful and try to exclude data belonging to unknown classes from the training objective. In contrast, we propose an OSSL framework that learns from all unlabeled data through self-supervision. Additionally, we use an energy-based score to accurately recognize data belonging to the known classes, which makes our method well suited to handling uncurated data in deployment. Extensive experimental evaluations show that, compared with existing OSSL methods, our method yields state-of-the-art results on many of the evaluated benchmark problems in terms of both closed-set accuracy and open-set recognition. Our code is available at https://github.com/walline/ssl-tf2-sefoss.
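The abstract does not specify the form of the energy-based score. As a minimal sketch, assuming the free-energy formulation that is common in energy-based out-of-distribution detection, the score of a sample can be computed from its classifier logits as E(x) = -T · log Σ_k exp(f_k(x) / T), where lower energy suggests a known class. The function names `energy_score` and `is_known`, the temperature `T`, and the threshold value are hypothetical choices for illustration and are not taken from the authors' code.

```python
import math


def energy_score(logits, temperature=1.0):
    """Free-energy score: -T * logsumexp(logits / T).

    Assumed formulation (not confirmed by the abstract). Lower values
    suggest the sample belongs to one of the known classes.
    """
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp


def is_known(logits, threshold, temperature=1.0):
    """Flag a sample as a known class when its energy is below the threshold.

    The threshold is a hypothetical hyperparameter; in practice it would
    be tuned on held-out data.
    """
    return energy_score(logits, temperature) < threshold


# A confidently classified sample has much lower energy than a sample
# with flat, uninformative logits.
confident = [10.0, 0.0, 0.0]
flat = [0.0, 0.0, 0.0]
print(energy_score(confident) < energy_score(flat))
```

Under this assumed formulation, a sample with one dominant logit receives a low energy and is treated as known, while a sample with flat logits receives a higher energy and can be flagged as belonging to an unknown class.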