Novel Class Discovery (NCD) aims to discover unknown, novel classes in an unlabeled set by leveraging knowledge already learned from known classes. Existing works focus on instance-level or class-level knowledge representation and build a shared representation space to achieve performance improvements. However, a long-neglected issue is the potentially imbalanced number of samples from known and novel classes, which biases the model toward the dominant classes. These methods therefore face a challenging trade-off between reviewing known classes and discovering novel classes. Based on this observation, we propose a Self-Cooperation Knowledge Distillation (SCKD) method that uses every training sample (known or novel, labeled or unlabeled) for both review and discovery. Specifically, the model's feature representations of known and novel classes are used to construct two disjoint representation spaces. Through spatial mutual information, we design a self-cooperation learning scheme that encourages the model to learn from its own two feature representation spaces. Extensive experiments on six datasets demonstrate that our method achieves significant performance improvements and state-of-the-art results.
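As a rough illustration of the self-cooperation idea, the PyTorch sketch below projects each batch into two separate spaces and lets each space distill the other's batch-level similarity structure, so every sample contributes to both review and discovery. The two projection heads, the cosine-similarity matrix used as a stand-in for spatial mutual information, and the symmetric MSE distillation loss are all illustrative assumptions, not the paper's actual formulation.

```python
# Minimal sketch of self-cooperation between two disjoint representation
# spaces. Head names, the similarity proxy, and the loss are assumptions.
import torch
import torch.nn.functional as F

def pairwise_similarity(z: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity matrix over a batch of feature vectors."""
    z = F.normalize(z, dim=1)
    return z @ z.t()

def self_cooperation_loss(feat_known: torch.Tensor,
                          feat_novel: torch.Tensor) -> torch.Tensor:
    """Each space distills the relational (similarity) structure of the
    other; gradients are stopped on the teacher side so both spaces act
    as teacher and student in turn."""
    sim_known = pairwise_similarity(feat_known)
    sim_novel = pairwise_similarity(feat_novel)
    loss_k2n = F.mse_loss(sim_novel, sim_known.detach())
    loss_n2k = F.mse_loss(sim_known, sim_novel.detach())
    return loss_k2n + loss_n2k

# Usage: project every sample (labeled or unlabeled) into both spaces.
feats = torch.randn(32, 256)            # backbone features for one batch
proj_known = torch.nn.Linear(256, 128)  # head for the known-class space
proj_novel = torch.nn.Linear(256, 128)  # head for the novel-class space
loss = self_cooperation_loss(proj_known(feats), proj_novel(feats))
```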