Class-incremental learning is becoming more popular because it lets models widen their applicability without forgetting what they already know. A trend in this area is to use a mixture-of-experts technique, in which several models work together to solve the task. However, the experts are usually trained all at once on the whole task data, which leaves all of them prone to forgetting and increases the computational burden. To address this limitation, we introduce a novel approach named SEED. SEED selects only one expert, the optimal one for the task under consideration, and uses data from this task to fine-tune only this expert. For this purpose, each expert represents each class with a Gaussian distribution, and the optimal expert is selected based on the similarity of those distributions. Consequently, SEED increases diversity and heterogeneity among the experts while maintaining the high stability of the ensemble. Extensive experiments demonstrate that SEED achieves state-of-the-art performance in exemplar-free settings across various scenarios, showing the potential of expert diversification through data in continual learning.
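The selection step described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: similarity between class Gaussians is measured here with a symmetric KL divergence, the per-expert score is the mean divergence over all class pairs, and the expert whose class distributions overlap least is the one chosen. The names `fit_gaussian`, `kl_gaussian`, and `select_expert` are hypothetical and used only for illustration.

```python
import numpy as np


def fit_gaussian(features, eps=1e-3):
    """Fit a mean and a regularized covariance to one class's features.

    features: array of shape (n_samples, dim), with dim >= 2.
    eps: small ridge term (an assumption) that keeps the covariance invertible.
    """
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])
    return mu, cov


def kl_gaussian(p, q):
    """KL divergence KL(p || q) between two multivariate Gaussians."""
    mu_p, cov_p = p
    mu_q, cov_q = q
    dim = mu_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (
        np.trace(cov_q_inv @ cov_p)
        + diff @ cov_q_inv @ diff
        - dim
        + logdet_q
        - logdet_p
    )


def select_expert(features_per_expert):
    """Pick the expert whose class Gaussians for the new task overlap least.

    features_per_expert[k][c] holds the features of class c as embedded by
    expert k. The mean symmetric KL over class pairs is an assumed score,
    not necessarily the criterion used by SEED.
    """
    scores = []
    for class_features in features_per_expert:
        gaussians = [fit_gaussian(f) for f in class_features]
        divergences = [
            kl_gaussian(gaussians[i], gaussians[j])
            + kl_gaussian(gaussians[j], gaussians[i])
            for i in range(len(gaussians))
            for j in range(i + 1, len(gaussians))
        ]
        scores.append(float(np.mean(divergences)))
    return int(np.argmax(scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Expert 0 separates the two new classes well; expert 1 barely does.
    expert_0 = [rng.normal(0.0, 1.0, (200, 2)), rng.normal(5.0, 1.0, (200, 2))]
    expert_1 = [rng.normal(0.0, 1.0, (200, 2)), rng.normal(0.1, 1.0, (200, 2))]
    print(select_expert([expert_0, expert_1]))
```

In this toy setup the first expert would be the one fine-tuned on the new task, while the other stays frozen. That is the mechanism by which training only one expert per task can limit forgetting in the rest of the ensemble.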