Open-world semi-supervised learning (OWSSL) extends conventional semi-supervised learning to open-world scenarios by accounting for novel categories in unlabeled datasets. Despite recent advances in OWSSL, its success often relies on two assumptions: 1) labeled and unlabeled datasets share the same balanced class prior distribution, which does not generally hold in real-world applications, and 2) evaluation is performed on the unlabeled training data itself, a transductive setting that might not adequately address challenges in the wild. In this paper, we aim to generalize OWSSL by relaxing both assumptions. Our work suggests that practical OWSSL may require training settings, evaluation methods, and learning strategies that differ from those prevalent in the existing literature.