Federated fine-tuning enables privacy-preserving LLM adaptation but faces a critical bottleneck: the gap between the high memory demands of LLMs and the limited capacity of edge devices. To break this memory barrier, we propose Chain Federated Fine-Tuning (ChainFed), a paradigm that forgoes end-to-end updates in favor of sequential, layer-by-layer training. ChainFed first trains the initial adapter to convergence, freezes its weights, and then proceeds to the next adapter. This iterative train-and-freeze process forms an optimization chain that gradually improves the model's task-specific proficiency. ChainFed further integrates three core techniques: 1) Dynamic Layer Co-Tuning, which bridges semantic gaps between sequentially tuned layers and facilitates information flow; 2) Globally Perceptive Optimization, which endows each adapter with foresight beyond its local objective; and 3) Function-Oriented Adaptive Tuning, which automatically identifies the optimal fine-tuning starting point. Extensive experiments on multiple benchmarks show that ChainFed outperforms existing methods, improving average accuracy by up to 46.46%.
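The train-and-freeze chain described above can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the ChainFed implementation: the "model" is a stack of scalar layers with frozen base weights, each adapter is a single additive scalar, aggregation is plain sample-weighted averaging (FedAvg), "convergence" is approximated by a fixed number of rounds, and the three core techniques are omitted. All names (`forward`, `local_update`, `chain_fed`) and hyperparameters are hypothetical.

```python
from typing import List, Tuple

Client = List[Tuple[float, float]]  # (x, y) pairs held locally by one device


def forward(x: float, base: List[float], adapters: List[float]) -> float:
    """Toy model: each layer scales its input by (frozen base weight + adapter)."""
    h = x
    for w, a in zip(base, adapters):
        h *= w + a
    return h


def loss(clients: List[Client], base: List[float], adapters: List[float]) -> float:
    """Mean squared error over all client samples (used here for evaluation only)."""
    samples = [s for c in clients for s in c]
    return sum((forward(x, base, adapters) - y) ** 2 for x, y in samples) / len(samples)


def local_update(data: Client, base: List[float], adapters: List[float],
                 layer: int, lr: float, steps: int) -> float:
    """Run gradient descent on ONE adapter; every other layer is treated as constant."""
    # Combined scale of all layers except the active one (fixed during this stage).
    rest = 1.0
    for k, (w, a) in enumerate(zip(base, adapters)):
        if k != layer:
            rest *= w + a
    a_active = adapters[layer]
    for _ in range(steps):
        grad = 0.0
        for x, y in data:
            pred = x * rest * (base[layer] + a_active)
            grad += 2.0 * (pred - y) * x * rest  # d(MSE)/d(a_active) for one sample
        a_active -= lr * grad / len(data)
    return a_active


def chain_fed(clients: List[Client], base: List[float], rounds_per_layer: int = 20,
              lr: float = 0.05, local_steps: int = 5) -> List[float]:
    """Train adapters one layer at a time; each stays fixed once its stage ends."""
    adapters = [0.0] * len(base)
    sizes = [len(c) for c in clients]
    for layer in range(len(base)):           # the optimization chain
        for _ in range(rounds_per_layer):    # assumption: fixed rounds stand in for convergence
            updates = [local_update(c, base, adapters, layer, lr, local_steps)
                       for c in clients]
            # Server step: sample-weighted average of the clients' adapter values.
            adapters[layer] = sum(u * n for u, n in zip(updates, sizes)) / sum(sizes)
        # adapters[layer] is never modified again after this point (frozen).
    return adapters


if __name__ == "__main__":
    clients = [[(0.5, 1.0), (1.0, 2.0)], [(1.5, 3.0)], [(1.2, 2.4), (0.8, 1.6)]]
    base = [1.0, 1.0, 1.0]
    trained = chain_fed(clients, base)
    print("loss before:", loss(clients, base, [0.0] * len(base)))
    print("loss after: ", loss(clients, base, trained))
```

In each stage only one adapter's value and gradient are updated, which is the property the abstract relies on to reduce memory. In this toy the first adapter can already fit the target on its own, so later stages only make small corrections; a real model would not behave this way.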