As a promising paradigm for collaboratively training models over decentralized data, Federated Learning (FL) can be exploited to fine-tune Large Language Models (LLMs). However, because LLMs are huge, the scale of the training data also increases significantly, which incurs tremendous computation and communication costs. Furthermore, the training data is generally non-Independent and Identically Distributed (non-IID), which requires adaptive data processing within each device. Although Low-Rank Adaptation (LoRA) can significantly reduce the number of parameters to update during fine-tuning, transferring the low-rank parameters of all the layers in an LLM still takes an unaffordable amount of time. In this paper, we propose a Fisher information-based Efficient Curriculum Federated Learning framework (FibecFed) with two novel methods, i.e., adaptive federated curriculum learning and efficient sparse parameter update. First, we propose a Fisher information-based method to adaptively sample data within each device so as to improve the effectiveness of the FL fine-tuning process. Second, we dynamically select the proper layers for global aggregation and sparse parameters for local update with LoRA so as to improve the efficiency of the FL fine-tuning process. Extensive experimental results on 10 datasets demonstrate that FibecFed yields excellent performance (up to 45.35% higher accuracy) and superb fine-tuning speed (up to 98.61% faster) compared with 17 baseline approaches.
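To make the Fisher information-based sampling idea concrete, below is a minimal, hypothetical sketch: it scores each local sample by the squared norm of the per-sample log-likelihood gradient (a standard empirical Fisher approximation) for a toy logistic model standing in for an LLM, and keeps the highest-scoring samples. The function names and the top-k selection rule are illustrative assumptions, not FibecFed's actual curriculum schedule.

```python
import math

def fisher_score(x, y, w):
    """Empirical Fisher score of one sample: squared norm of the gradient of
    the log-likelihood w.r.t. the weights of a toy logistic model.
    (Illustrative stand-in; the real framework scores over LLM parameters.)"""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))          # predicted probability of class 1
    g = [(y - p) * xi for xi in x]          # gradient of log-likelihood
    return sum(gi * gi for gi in g)

def select_curriculum(samples, w, k):
    """Hypothetical selection rule: rank a device's local samples by Fisher
    score and keep the top-k most informative ones for this round."""
    ranked = sorted(samples, key=lambda s: fisher_score(s[0], s[1], w),
                    reverse=True)
    return ranked[:k]

# A confidently correct sample carries little information (small gradient),
# while a misclassified sample carries much more.
w = [1.0]
easy = ([5.0], 1)   # model predicts ~0.99 for class 1, label is 1
hard = ([5.0], 0)   # model predicts ~0.99 for class 1, label is 0
```

With this scoring, `select_curriculum([easy, hard], w, 1)` keeps the misclassified sample, matching the intuition that higher-Fisher samples contribute more to effective fine-tuning.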