On-device learning is essential for personalization, privacy, and long-term adaptation in resource-constrained environments. It requires efficient learning in two settings: fine-tuning existing models and continually acquiring new tasks without catastrophic forgetting. Both settings are constrained by the high memory cost of storing activations for backpropagation. Existing activation compression methods reduce this cost, but they rely on repeated low-rank decompositions, which introduce computational overhead, and they have not been explored for continual learning. We propose LANCE (Low-rank Activation Compression), a framework that performs a one-shot higher-order Singular Value Decomposition (SVD) to obtain a reusable low-rank subspace for activation projection. This eliminates repeated decompositions, reducing both memory and computation. Moreover, the fixed low-rank subspaces enable on-device continual learning by allocating tasks to orthogonal subspaces without storing large task-specific matrices. Experiments show that LANCE reduces activation storage by up to 250$\times$ while maintaining accuracy comparable to full backpropagation on the CIFAR-10/100, Oxford-IIIT Pets, Flowers102, and CUB-200 datasets. On continual learning benchmarks (Split CIFAR-100, Split MiniImageNet, 5-Datasets), it performs competitively with orthogonal gradient projection methods at a fraction of the memory cost. These results position LANCE as a practical and scalable solution for efficient fine-tuning and continual learning on edge devices.
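To make the idea of a one-shot, reusable low-rank subspace concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes activations of shape (batch, channels, height, width), computes a truncated higher-order SVD once on a calibration batch, and then reuses the resulting bases to project later activations. The function names (`unfold`, `hosvd_bases`, `compress`, `decompress`), the chosen ranks, and the decision to leave the batch mode uncompressed are all illustrative assumptions.

```python
import numpy as np

def unfold(x, mode):
    """Unfold tensor x along `mode` into a matrix of shape (x.shape[mode], -1)."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def mode_product(x, mat, mode):
    """Multiply x along `mode` by mat, where mat has shape (new_dim, x.shape[mode])."""
    out = np.tensordot(mat, x, axes=(1, mode))  # new_dim becomes axis 0
    return np.moveaxis(out, 0, mode)            # move it back to position `mode`

def hosvd_bases(calib, ranks):
    """One-shot truncated HOSVD: one orthonormal basis per compressed mode.

    `ranks` maps a mode index to the number of retained singular vectors
    (hypothetical interface, chosen for this sketch).
    """
    bases = {}
    for mode, r in ranks.items():
        u, _, _ = np.linalg.svd(unfold(calib, mode), full_matrices=False)
        bases[mode] = u[:, :r]
    return bases

def compress(x, bases):
    """Project activations onto the fixed subspaces; only this core is stored."""
    for mode, u in bases.items():
        x = mode_product(x, u.T, mode)
    return x

def decompress(core, bases):
    """Approximate the original activations from the stored core."""
    for mode, u in bases.items():
        core = mode_product(core, u, mode)
    return core

# Synthetic activations that lie exactly in low-rank subspaces (assumed data).
rng = np.random.default_rng(0)
a_c, a_h, a_w = rng.normal(size=(16, 4)), rng.normal(size=(8, 3)), rng.normal(size=(8, 3))

def make_batch(batch_size):
    core = rng.normal(size=(batch_size, 4, 3, 3))
    return np.einsum("bijk,ci,hj,wk->bchw", core, a_c, a_h, a_w)

# Decompose once on a calibration batch, then reuse the bases on new batches.
bases = hosvd_bases(make_batch(32), ranks={1: 4, 2: 3, 3: 3})
acts = make_batch(8)
stored = compress(acts, bases)
restored = decompress(stored, bases)
ratio = acts.size / stored.size  # activation storage reduction for this batch
```

In this toy setting the data is exactly low-rank, so reconstruction is exact up to floating-point error; real activations would be only approximately low-rank, and the retained ranks would trade memory against gradient fidelity.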