In a continual learning setting, a model must be plastic enough to learn a new task and stable enough not to disturb previously learned capabilities. We argue that this dilemma has an architectural root. A finite network has limited representational and plastic resources, yet the capacity it needs depends on unknown properties of the future task stream: how many tasks will be encountered, and how much they overlap in feature space. Regularization-based methods preserve past knowledge within fixed-capacity architectures and therefore implicitly rely on an oracle architecture sized for this unknown future. When tasks are only weakly related, fixed architectures progressively run out of plastic resources; when tasks are few or strongly overlapping, models are often over-provisioned. Inspired by neurogenesis in biology, we propose NORACL, which addresses the stability-plasticity dilemma by tackling the oracle architecture problem through neuronal growth. Starting from a compact network, NORACL grows only when needed, monitoring two complementary signals that indicate representational saturation and plasticity saturation. We evaluate NORACL against oracle-sized static baselines across varying task counts and task geometries. Across all settings, NORACL matches or exceeds the final average accuracy of oracle-provisioned static baselines while using fewer parameters. Additionally, NORACL yields architectures with interpretable growth: dissimilar tasks predominantly expand feature-extraction layers, whereas tasks that rely on common features shift growth toward later feature-combination layers. Our analysis further explains why fixed-capacity networks lose plasticity as tasks accumulate, whereas NORACL creates fresh capacity for new tasks through growth. Together, these results show that adaptive neurogenesis pushes the stability-plasticity Pareto frontier of continual learning.
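The grow-only-when-needed idea in the abstract can be pictured with a minimal sketch. The abstract does not define the two saturation signals or how they are combined, so everything here is an assumption made for illustration: the `GrowthMonitor` class, the per-layer scores in [0, 1], the thresholds, the fixed growth step, and the rule that a layer grows when either signal reaches its threshold. It is not NORACL's actual procedure.

```python
from dataclasses import dataclass


@dataclass
class GrowthMonitor:
    """Hypothetical helper; thresholds and step size are illustrative assumptions."""

    rep_threshold: float = 0.9    # assumed cutoff for representational saturation
    plast_threshold: float = 0.9  # assumed cutoff for plasticity saturation
    grow_by: int = 8              # assumed number of units added per growth event

    def decide(self, rep_saturation, plast_saturation):
        """Return the number of units to add to each layer.

        Both arguments hold one score in [0, 1] per layer. A layer grows when
        either score reaches its threshold (an assumption, not the paper's rule).
        """
        if len(rep_saturation) != len(plast_saturation):
            raise ValueError("expected one score of each kind per layer")
        return [
            self.grow_by if (r >= self.rep_threshold or p >= self.plast_threshold) else 0
            for r, p in zip(rep_saturation, plast_saturation)
        ]


def grow(widths, additions):
    """Return new layer widths after applying the per-layer additions."""
    return [w + a for w, a in zip(widths, additions)]


monitor = GrowthMonitor()
widths = [16, 16]  # compact starting network
additions = monitor.decide(rep_saturation=[0.95, 0.20], plast_saturation=[0.10, 0.30])
widths = grow(widths, additions)
print(widths)  # [24, 16]
```

In this toy run only the first layer is flagged as saturated, so only it widens. A fixed-capacity baseline would correspond to never calling `grow`, which is the over- or under-provisioning situation the abstract describes.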