When Does Structure Matter in Continual Learning? Dimensionality Controls When Modularity Shapes Representational Geometry

Continual learning systems must strike a balance between plasticity, the ability to acquire new knowledge, and stability, the ability to preserve previously learned representations. This stability-plasticity dilemma affects how representations can be reused across tasks: shared structure enables transfer when tasks are similar, but it can also induce interference when new learning disrupts existing representations. However, it remains unclear when and why structural separation influences this trade-off. In this study, we examine how network architecture, task similarity, and representational dimensionality jointly shape learning in a sequential task paradigm inspired by transfer-interference studies. We compare a task-partitioned modular recurrent network with a single-module baseline, systematically varying task similarity (low, medium, high) and the scale of weight initialization. The initialization scale induces different learning regimes, which we characterize empirically through the effective dimensionality of the learned representations. We find that architecture has minimal impact in high-dimensional regimes, where representations are sufficiently unconstrained to accommodate multiple tasks without strong interference. In contrast, in lower-dimensional (rich) regimes, architectural separation is decisive: modular networks exhibit a graded alignment of task-specific subspaces, with overlap for similar tasks, partial orthogonalization for moderately dissimilar tasks, and stronger separation for dissimilar tasks. This graded geometry is absent in the single-module baseline. Our findings suggest that representational dimensionality is a key organizing variable governing when structural separation becomes functionally relevant, and they highlight adaptive geometry as a central principle for designing continual learning systems.
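The abstract does not state how effective dimensionality or subspace alignment is measured. As a minimal sketch, assuming the participation ratio of the covariance spectrum as the dimensionality measure and the mean squared cosine of principal angles as the overlap measure (both common choices, not confirmed as the ones used here), the two quantities could be computed from hidden-state matrices as follows. The function names and the parameter `k` are illustrative.

```python
import numpy as np

def participation_ratio(X):
    """Effective dimensionality of activity X with shape (n_samples, n_units).

    Assumed measure: PR = (sum_i lam_i)^2 / sum_i lam_i^2, where lam_i are
    the eigenvalues of the covariance of X. PR ranges from 1 (all variance
    on one axis) to n_units (variance spread evenly).
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data are proportional to the
    # covariance eigenvalues; the proportionality constant cancels in PR.
    lam = np.linalg.svd(Xc, compute_uv=False) ** 2
    return float(lam.sum() ** 2 / (lam ** 2).sum())

def subspace_overlap(A, B, k):
    """Overlap between the top-k principal subspaces of A and B.

    Returns the mean squared cosine of the principal angles:
    1.0 for identical subspaces, 0.0 for orthogonal ones.
    """
    def top_k_basis(X):
        Xc = X - X.mean(axis=0, keepdims=True)
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        return vt[:k].T  # orthonormal columns, shape (n_units, k)

    Ua, Ub = top_k_basis(A), top_k_basis(B)
    # Singular values of Ua^T Ub are the cosines of the principal angles.
    cosines = np.linalg.svd(Ua.T @ Ub, compute_uv=False)
    return float(np.mean(cosines ** 2))
```

Under these assumptions, a lower participation ratio would correspond to the lower-dimensional (rich) regime, and the graded geometry described above would appear as an overlap that decreases as task similarity decreases.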
