As a promising field in open-world learning, \textit{Novel Class Discovery} (NCD) is typically the task of clustering unseen novel classes in an unlabeled set by leveraging prior knowledge from labeled data within the same domain. However, the performance of existing NCD methods can be severely compromised when novel classes are sampled from a distribution different from that of the labeled ones. In this paper, we explore and establish the solvability of NCD in the cross-domain setting, with the necessary condition that style information be removed. Based on this theoretical analysis, we introduce an exclusive style removal module that extracts style information distinct from the baseline features, thereby facilitating inference. Moreover, this module is easy to integrate with other NCD methods, acting as a plug-in that improves performance on novel classes whose distribution differs from that of the seen labeled set. Additionally, recognizing the non-negligible influence of different backbones and pre-training strategies on the performance of NCD methods, we build a fair benchmark for future NCD research. Extensive experiments on three common datasets demonstrate the effectiveness of the proposed module.