Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning, a setting where not all the data samples are labeled. A primary issue in this scenario is that the model forgets representations of unlabeled data and overfits the labeled samples. We leverage the power of nearest-neighbor classifiers to nonlinearly partition the feature space and flexibly model the underlying data distribution thanks to their non-parametric nature. This enables the model to learn a strong representation for the current task and to distill relevant information from previous tasks. We perform a thorough experimental evaluation and show that our method outperforms all the existing approaches by large margins, setting a solid state of the art on the continual semi-supervised learning paradigm. For example, on CIFAR-100 we surpass several other methods even when using at least 30 times less supervision (0.8% vs. 25% of annotations). Finally, our method works well on both low- and high-resolution images and scales seamlessly to more complex datasets such as ImageNet-100. The code is publicly available at https://github.com/kangzhiq/NNCSL
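To make the nearest-neighbor idea in the abstract concrete, the following is a minimal sketch of a soft nearest-neighbor classifier in NumPy. It is not the authors' implementation (which is in the repository linked above); it only illustrates how a non-parametric classifier can assign class probabilities to unlabeled samples from their similarity to a small set of labeled ones. The function name `soft_nn_predict`, the cosine similarity, the temperature value, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def soft_nn_predict(queries, support, support_labels, num_classes, temperature=0.1):
    """Return class probabilities for each query as a similarity-weighted
    average of the one-hot labels of the labeled (support) samples.

    Hypothetical helper for illustration; not taken from the NNCSL code base.
    """
    # L2-normalize features so the dot product equals cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    s = support / np.linalg.norm(support, axis=1, keepdims=True)

    # Temperature-scaled similarities between every query and every support sample.
    logits = (q @ s.T) / temperature

    # Numerically stable softmax over the support samples.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted vote over one-hot support labels gives a distribution over classes.
    one_hot = np.eye(num_classes)[support_labels]
    return weights @ one_hot


# Toy example: four labeled 2-D features in two classes, two unlabeled queries.
support = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
support_labels = np.array([0, 0, 1, 1])
queries = np.array([[1.0, 0.05], [0.05, 1.0]])

probs = soft_nn_predict(queries, support, support_labels, num_classes=2)
pseudo_labels = probs.argmax(axis=1)
```

Because the prediction depends only on distances to stored labeled features, the decision boundary is not restricted to a linear form, which is the property the abstract attributes to nearest-neighbor classifiers. How such predictions are used for training and distillation across tasks is specific to the paper and is not reproduced here.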