Real-world datasets often exhibit long-tailed distributions, where a few dominant "Head" classes have abundant samples while most "Tail" classes are severely underrepresented, leading to biased learning and poor generalization on the Tail. We present a theoretical framework that reveals a previously undescribed connection between Long-Tailed Recognition (LTR) and Continual Learning (CL), the process of learning sequential tasks without forgetting prior knowledge. Our analysis demonstrates that, for models trained on imbalanced datasets, the weights converge to a bounded neighborhood of those trained exclusively on the Head, with the bound scaling as the inverse square root of the imbalance factor. Leveraging this insight, we introduce Continual Learning for Long-Tailed Recognition (CLTR), a principled approach that employs standard off-the-shelf CL methods to address LTR by sequentially learning the Head and Tail classes without forgetting the Head. Our theoretical analysis further suggests that CLTR mitigates gradient saturation and improves Tail learning while maintaining strong Head performance. Extensive experiments on CIFAR100-LT, CIFAR10-LT, ImageNet-LT, and Caltech256 validate our theoretical predictions and achieve strong results across various LTR benchmarks. Our work bridges the gap between LTR and CL, providing a principled way to tackle imbalanced-data challenges with standard CL strategies.
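To make the central bound concrete, the convergence statement above can be written schematically as follows; the symbols here ($\theta^{*}_{\mathrm{LT}}$, $\theta^{*}_{\mathrm{Head}}$, $\rho$, $C$) are our illustrative shorthand, not necessarily the paper's exact notation:

$$\bigl\|\theta^{*}_{\mathrm{LT}} - \theta^{*}_{\mathrm{Head}}\bigr\| \;\le\; \frac{C}{\sqrt{\rho}},$$

where $\theta^{*}_{\mathrm{LT}}$ denotes the weights learned on the full imbalanced dataset, $\theta^{*}_{\mathrm{Head}}$ the weights learned on the Head classes alone, $\rho$ the imbalance factor (the ratio of the largest to the smallest class size), and $C$ a problem-dependent constant. Intuitively, the more extreme the imbalance, the less the Tail moves the solution away from the Head-only optimum, which motivates treating the Tail as a second task to be learned continually.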
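As a concrete illustration of the CLTR recipe (learn the Head first, then the Tail with an off-the-shelf CL method), the following is a minimal PyTorch sketch that uses experience replay as the CL method. The toy data, shapes, and helper names (`split_head_tail`, `train_epoch`) are ours for illustration only; the paper's actual instantiation may pair the same two-stage schedule with a different CL algorithm.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, ConcatDataset

def split_head_tail(labels, min_count):
    # Classes with at least `min_count` samples form the Head task;
    # the remaining rare classes form the Tail task.
    counts = torch.bincount(labels)
    head_classes = (counts >= min_count).nonzero().flatten()
    head_mask = torch.isin(labels, head_classes)
    return head_mask, ~head_mask

def train_epoch(model, loader, optimizer, criterion):
    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

# --- Toy long-tailed data (hypothetical shapes and class counts) ---
torch.manual_seed(0)
x = torch.randn(600, 20)
y = torch.cat([torch.randint(0, 5, (500,)),    # 5 Head classes, abundant
               torch.randint(5, 10, (100,))])  # 5 Tail classes, scarce
head_mask, tail_mask = split_head_tail(y, min_count=50)

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Stage 1: learn the Head task alone.
head_ds = TensorDataset(x[head_mask], y[head_mask])
for _ in range(5):
    train_epoch(model, DataLoader(head_ds, batch_size=64, shuffle=True),
                optimizer, criterion)

# Stage 2: learn the Tail while rehearsing a small buffer of Head
# exemplars -- a standard replay-based CL method that guards against
# forgetting the Head.
buffer_idx = torch.randperm(int(head_mask.sum()))[:100]
replay_ds = TensorDataset(x[head_mask][buffer_idx], y[head_mask][buffer_idx])
tail_ds = TensorDataset(x[tail_mask], y[tail_mask])
for _ in range(5):
    train_epoch(model, DataLoader(ConcatDataset([tail_ds, replay_ds]),
                                  batch_size=64, shuffle=True),
                optimizer, criterion)
```

In this sketch the replay buffer is the only CL-specific component; any standard method (e.g., a regularization penalty such as EWC, or distillation as in LwF) could be substituted in stage 2 without changing the Head-then-Tail schedule.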