Fixed representational capacity is a fundamental constraint in continual learning: practitioners must guess an appropriate model width before training, without knowing how many distinct concepts the data contains. We propose LACE (Loss-Adaptive Capacity Expansion), a simple online mechanism that expands a model's representational capacity during training by monitoring its own loss signal. When sustained loss deviation exceeds a threshold, indicating that the current capacity is insufficient for newly encountered data, LACE adds new dimensions to the projection layer and trains them jointly with the existing parameters. Across synthetic and real-data experiments, LACE triggers expansions exclusively at domain boundaries (100% boundary precision, zero false positives), matches the accuracy of a large fixed-capacity model while starting from a fraction of its dimensions, and produces adapter dimensions that are collectively critical to performance (a 3% accuracy drop when all adapters are removed). We further demonstrate unsupervised domain separation in GPT-2 activations via layer-wise clustering; the resulting U-shaped separability curve across layers motivates adaptive capacity allocation in deep networks. LACE requires no labels, no replay buffers, and no external controllers, making it suitable for on-device continual learning under resource constraints.
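The trigger described in the abstract (sustained loss deviation above a threshold, followed by adding dimensions to the projection layer) can be illustrated with a minimal sketch. The abstract does not specify how the deviation is measured, so everything here is an assumption made for illustration: the class name `LossTriggeredExpander`, the running-average baseline, and the parameters `threshold`, `patience`, `grow_by`, and `ema_decay` are hypothetical. Joint training of the new and existing parameters is not shown.

```python
import random


class LossTriggeredExpander:
    """Hypothetical sketch: widen a projection matrix when loss stays elevated.

    Not the paper's implementation. The baseline rule, threshold, patience,
    and growth size are all assumed for illustration.
    """

    def __init__(self, in_dim, init_dim, grow_by=4, threshold=0.5,
                 patience=5, ema_decay=0.9, seed=0):
        self.rng = random.Random(seed)
        # Projection weights stored as in_dim rows of out_dim columns.
        self.W = [[self.rng.gauss(0.0, 0.02) for _ in range(init_dim)]
                  for _ in range(in_dim)]
        self.grow_by = grow_by
        self.threshold = threshold    # assumed: minimum loss excess that counts
        self.patience = patience      # assumed: consecutive steps required
        self.ema_decay = ema_decay
        self.baseline = None          # running estimate of "normal" loss
        self.streak = 0               # consecutive steps above the threshold

    @property
    def out_dim(self):
        return len(self.W[0])

    def observe(self, loss):
        """Record one training loss; return True if an expansion fired."""
        if self.baseline is None:
            self.baseline = loss
            return False
        if loss - self.baseline > self.threshold:
            # Freeze the baseline while the loss is elevated, so a real
            # shift is not absorbed into the running average.
            self.streak += 1
        else:
            self.streak = 0
            self.baseline = (self.ema_decay * self.baseline
                             + (1.0 - self.ema_decay) * loss)
        if self.streak >= self.patience:
            self._expand()
            self.baseline = loss      # restart the baseline after growing
            self.streak = 0
            return True
        return False

    def _expand(self):
        # Append freshly initialised columns; existing weights are untouched.
        for row in self.W:
            row.extend(self.rng.gauss(0.0, 0.02) for _ in range(self.grow_by))


# Simulated loss trace: a stable domain, then a jump at step 20.
losses = [1.0] * 20 + [2.0] * 10
expander = LossTriggeredExpander(in_dim=8, init_dim=4)
fired = [step for step, loss in enumerate(losses) if expander.observe(loss)]
print(fired, expander.out_dim)
```

In this toy trace the expansion fires once, `patience` steps after the simulated shift, and an isolated loss spike resets the streak without triggering growth. Requiring a sustained deviation rather than a single spike is what the abstract's wording suggests, though the actual statistic used by LACE may differ.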