Centralized training is the standard paradigm in deep learning, enabling models to learn from a unified dataset in a single location. In such a setup, isotropic feature distributions naturally arise as a means to support well-structured and generalizable representations. In contrast, continual learning operates on streaming, non-stationary data and trains models incrementally, inherently facing the well-known plasticity-stability dilemma. In such settings, the learning dynamics tend to yield an increasingly anisotropic feature space. This raises a fundamental question: should isotropy be enforced to achieve a better balance between stability and plasticity, and thereby mitigate catastrophic forgetting? In this paper, we investigate whether promoting feature-space isotropy can enhance representation quality in continual learning. Through experiments with contrastive continual learning techniques on the CIFAR-10 and CIFAR-100 datasets, we find that isotropic regularization fails to improve, and can in fact degrade, model accuracy in continual settings. Our results highlight essential differences in feature geometry between centralized and continual learning, suggesting that isotropy, while beneficial in centralized setups, may not constitute an appropriate inductive bias for non-stationary learning scenarios.
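To make the notion of feature-space isotropy concrete, one common way to quantify it (an illustrative sketch, not necessarily the metric used in this paper) is the spread of the eigenvalues of the feature covariance matrix: a score near 1 indicates variance is spread evenly across directions (isotropic), while a score near 0 indicates variance concentrated in a few directions (anisotropic). The function and variable names below are hypothetical.

```python
import numpy as np

def isotropy_score(features: np.ndarray) -> float:
    """Ratio of the smallest to the largest eigenvalue of the
    feature covariance matrix. 1.0 means perfectly isotropic;
    values near 0 mean highly anisotropic features."""
    cov = np.cov(features, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)  # ascending order
    return float(eigvals[0] / eigvals[-1])

rng = np.random.default_rng(0)
iso = rng.normal(size=(1000, 8))                    # roughly isotropic cloud
aniso = iso * np.array([10, 1, 1, 1, 1, 1, 1, 1])   # stretch one direction
print(isotropy_score(iso))    # close to 1
print(isotropy_score(aniso))  # close to 0
```

An isotropic regularizer in the spirit discussed above would add a penalty that pushes this covariance toward a scaled identity; the paper's finding is that such pressure does not help, and can hurt, under non-stationary data.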