Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants

Enabling a model to learn new classes while preserving its performance on old classes is a central challenge in class incremental learning (CIL). Beyond the standard setting, long-tail class incremental learning (LTCIL) and few-shot class incremental learning (FSCIL) have been proposed to account for data imbalance and data scarcity, respectively; both are common in real-world deployments and further exacerbate the well-known problem of catastrophic forgetting. Existing methods are each designed for only one of the three tasks. In this paper, we offer a unified solution to the misalignment dilemma shared by the three tasks. Concretely, we propose the neural collapse terminus, a fixed structure with maximal equiangular inter-class separation over the whole label space. It serves as a consistent target throughout incremental training, which avoids dividing the feature space incrementally. For CIL and LTCIL, we further propose a prototype evolving scheme that drives the backbone features smoothly toward the neural collapse terminus. Our method also works for FSCIL with only minor adaptations. Theoretical analysis indicates that our method maintains neural collapse optimality in an incremental fashion regardless of data imbalance or data scarcity. To test the generalizability of our method, we also design a generalized case in which neither the total number of classes nor the data distribution of each incoming session (normal, long-tail, or few-shot) is known in advance. Extensive experiments on multiple datasets demonstrate the effectiveness of our unified solution on all three tasks and the generalized case.
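
As an illustration of what a fixed structure with maximal equiangular separation can look like, the sketch below builds a simplex equiangular tight frame, the geometry usually associated with neural collapse. The abstract does not spell out the construction, so the simplex form, the function name `simplex_etf`, and the requirement that the feature dimension be at least the number of classes are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def simplex_etf(num_classes: int, feat_dim: int, seed: int = 0) -> np.ndarray:
    """Return a (feat_dim, num_classes) matrix of class prototypes.

    Hypothetical helper for illustration. Columns have unit norm and every
    pair of distinct columns has cosine similarity -1 / (num_classes - 1).
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if feat_dim < num_classes:
        # Simplifying assumption of this sketch, not a requirement from the paper.
        raise ValueError("this sketch assumes feat_dim >= num_classes")

    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix yields orthonormal columns.
    u, _ = np.linalg.qr(rng.standard_normal((feat_dim, num_classes)))

    k = num_classes
    centering = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * (u @ centering)


# Build prototypes once for the whole label space, then inspect the geometry.
prototypes = simplex_etf(num_classes=10, feat_dim=64)
gram = prototypes.T @ prototypes
off_diagonal = gram[~np.eye(10, dtype=bool)]
```

Because the matrix is built once for the whole label space and then kept fixed, each incremental session would only read the columns of the classes it introduces; the targets assigned to earlier classes are never moved.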
