Large language models (LLMs) achieve strong performance on many domain-specific tasks after fine-tuning on appropriate data. However, much domain-specific data is private and distributed across multiple owners, which has raised interest in fine-tuning LLMs with federated learning (FL). Because FL clients have limited computation and communication capacity, they struggle to fine-tune an LLM effectively. To this end, we introduce FedBiOT, a resource-efficient approach to LLM fine-tuning in FL. Specifically, the server generates a compressed LLM and aligns its performance with that of the full model. The clients then fine-tune a lightweight yet important part of the compressed model, referred to as an adapter. Because the server has no access to the clients' private data, the data it uses for alignment follows a different distribution from the data the clients use for fine-tuning. We formulate the task as a bi-level optimization problem to minimize the negative effect of this data discrepancy, and we derive the update rules for the server and the clients. Extensive experiments on LLaMA-2 show empirically that the adapter achieves exceptional performance when reintegrated into the global LLM. The results also indicate that FedBiOT significantly reduces resource consumption compared to existing benchmarks while achieving comparable performance.
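The client and server roles described in the abstract can be illustrated with a toy sketch. This is not the FedBiOT algorithm itself: the abstract does not specify how adapters are aggregated or how alignment is performed, so the sketch assumes a FedAvg-style weighted average and omits the server-side alignment step entirely. The names `Params`, `local_finetune`, and `fedavg_adapters` are hypothetical, and a dictionary of floats stands in for the adapter weights.

```python
from typing import Dict, List

# Toy stand-in for a set of named adapter weights (hypothetical type alias).
Params = Dict[str, float]


def local_finetune(adapter: Params, grads: Params, lr: float = 0.1) -> Params:
    """One gradient step on the adapter only.

    The compressed backbone is assumed frozen on the client, so it does not
    appear here. Returns a new dict and leaves the input unchanged.
    """
    return {name: value - lr * grads[name] for name, value in adapter.items()}


def fedavg_adapters(client_adapters: List[Params], num_samples: List[int]) -> Params:
    """Sample-weighted average of client adapters.

    FedAvg-style aggregation is an assumption; the abstract does not state
    the aggregation rule.
    """
    total = sum(num_samples)
    names = client_adapters[0].keys()
    return {
        name: sum(a[name] * n for a, n in zip(client_adapters, num_samples)) / total
        for name in names
    }


# One communication round with two clients holding 1 and 3 samples.
global_adapter = {"w": 1.0}
client_1 = local_finetune(global_adapter, {"w": 1.0})   # w moves to about 0.9
client_2 = local_finetune(global_adapter, {"w": -1.0})  # w moves to about 1.1
global_adapter = fedavg_adapters([client_1, client_2], [1, 3])
print(global_adapter)  # w is approximately 1.05
```

Only the adapter travels between clients and server in this sketch, which is the source of the communication savings the abstract refers to; the full model never leaves the server.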