Most Continual Learning (CL) methods push the limits of supervised learning settings, where an agent is expected to learn new labeled tasks without forgetting previous knowledge. However, these settings are not well aligned with real-life scenarios, where a learning agent has access to a vast amount of unlabeled data encompassing both novel (entirely unlabeled) classes and examples from known classes. Drawing inspiration from Generalized Category Discovery (GCD), we introduce a novel framework that relaxes the assumption of full supervision. Specifically, in any task, we allow both novel and known classes to be present, and a continual version of unsupervised learning methods must be used to discover them. We call this setting Generalized Continual Category Discovery (GCCD). It unifies CL and GCD, bridging the gap between synthetic benchmarks and real-life scenarios. Through a series of experiments, we show that existing methods fail to accumulate knowledge from subsequent tasks in which unlabeled samples of novel classes are present. In light of these limitations, we propose a method that incorporates both supervised and unsupervised signals and mitigates forgetting through centroid adaptation. Our method surpasses strong CL methods adapted to GCD techniques and achieves superior representation learning performance.