We tackle the problem of Continual Category Discovery (CCD), which aims to automatically discover novel categories in a continuous stream of unlabeled data while mitigating catastrophic forgetting, an open problem that persists even in conventional, fully supervised continual learning. To address this challenge, we propose PromptCCD, a simple yet effective framework that uses a Gaussian Mixture Model (GMM) as a prompting method for CCD. At the core of PromptCCD lies the Gaussian Mixture Prompting (GMP) module, which acts as a dynamic pool that is updated over time to facilitate representation learning and prevent forgetting during category discovery. Moreover, GMP enables on-the-fly estimation of the number of categories, allowing PromptCCD to discover categories in unlabeled data without prior knowledge of how many there are. We extend the standard evaluation metric for Generalized Category Discovery (GCD) to CCD and benchmark state-of-the-art methods on diverse public datasets. PromptCCD significantly outperforms existing methods on these benchmarks. Project page: https://visual-ai.github.io/promptccd
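To make the idea of a GMM acting as a prompt pool more concrete, the following is a minimal sketch and not the PromptCCD implementation. It assumes diagonal-covariance components and assumes that the means of the most likely components for a given feature are used as prompts. The function name `select_prompts`, the `top_k` parameter, and the toy values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def select_prompts(feature, means, variances, weights, top_k=2):
    """Pick the top_k mixture components that best explain one feature.

    Hypothetical helper: treats component means as the 'prompts'.
    feature:   (D,)   feature vector of one sample
    means:     (K, D) component means
    variances: (K, D) diagonal variances of each component
    weights:   (K,)   mixture weights, summing to 1
    """
    diff = feature - means  # (K, D), broadcast over components
    # Log-density of a diagonal Gaussian, one value per component.
    log_pdf = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=1
    )
    # Unnormalized log-posterior: log weight plus log-likelihood.
    log_joint = np.log(weights) + log_pdf
    # Indices of the top_k components, most likely first.
    idx = np.argsort(log_joint)[::-1][:top_k]
    return idx, means[idx]


# Toy pool with three 2-D components (illustrative values only).
means = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
variances = np.ones_like(means)
weights = np.array([0.3, 0.3, 0.4])

feature = np.array([4.8, 5.1])
idx, prompts = select_prompts(feature, means, variances, weights, top_k=2)
print(idx, prompts)
```

In this toy setting the feature lies closest to the second component, so that component's mean is returned first. A pool that is refit as new data arrives would change `means`, `variances`, and `weights` over time, which is one way to read the "dynamic pool" described above.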