Large language models (LLMs) have emerged as important components across various fields, yet training them requires substantial computational resources and abundant labeled data, which makes robustly training LLMs for individual users (clients) challenging. An intuitive remedy is federated learning (FL), which collaboratively trains models on distributed private data. However, existing methods suffer from data heterogeneity, system heterogeneity, and large model sizes, resulting in suboptimal performance and high costs. In this work, we propose a personalized federated learning (PFL) framework, FDLoRA, which allows a client to be a single device or a cluster and which adopts low-rank adaptation (LoRA) tuning. FDLoRA places dual LoRA modules on each client to capture personalized and global knowledge, respectively; only the global LoRA module uploads its parameters to the central server, where cross-client knowledge is aggregated. Finally, an adaptive fusion approach combines the parameters of the two LoRA modules. This enables FDLoRA to make effective use of private data distributed across different clients, improving per-client performance without incurring high communication or computing costs. We conducted extensive experiments in two practical scenarios. The results demonstrate that FDLoRA outperforms six baselines in terms of performance, stability, robustness, computation cost, and communication cost.
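To make the dual-LoRA idea concrete, the sketch below shows how a personalized and a global low-rank update could be merged into one effective weight matrix. It is a minimal illustration only: the single scalar `alpha` stands in for FDLoRA's adaptive fusion coefficient, whose actual form is not specified in this abstract, and the function names are hypothetical.

```python
import numpy as np

def lora_delta(A, B):
    """Low-rank update Delta W = B @ A, with rank r << min(d_out, d_in)."""
    return B @ A

def fuse_dual_lora(W0, A_p, B_p, A_g, B_g, alpha=0.5):
    """Combine a personalized (A_p, B_p) and a global (A_g, B_g) LoRA
    module into one effective weight. The convex weight `alpha` is an
    illustrative stand-in for FDLoRA's learned adaptive fusion."""
    return W0 + alpha * lora_delta(A_p, B_p) + (1.0 - alpha) * lora_delta(A_g, B_g)

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 2          # frozen base weight is d_out x d_in
W0 = rng.standard_normal((d_out, d_in))
A_p, B_p = rng.standard_normal((r, d_in)), rng.standard_normal((d_out, r))
A_g, B_g = rng.standard_normal((r, d_in)), rng.standard_normal((d_out, r))

# Only (A_g, B_g) would be communicated to the server for aggregation;
# (A_p, B_p) stays on the client.
W_eff = fuse_dual_lora(W0, A_p, B_p, A_g, B_g, alpha=0.7)
print(W_eff.shape)  # (8, 16)
```

Note that communicating only the rank-r factors (rather than full weights) is what keeps the communication cost low.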