Large Language Models (LLMs) have demonstrated impressive success across various tasks. Integrating LLMs with Federated Learning (FL), a paradigm known as FedLLM, offers a promising avenue for collaborative model adaptation while preserving data privacy. This survey provides a systematic and comprehensive review of FedLLM. We begin by tracing the historical development of both LLMs and FL, summarizing relevant prior research to set the context. Subsequently, we delve into an in-depth analysis of the fundamental challenges inherent in deploying FedLLM. Addressing these challenges often requires efficient adaptation strategies; therefore, we conduct an extensive examination of existing Parameter-Efficient Fine-tuning (PEFT) methods and explore their applicability within the FL framework. To rigorously evaluate the performance of FedLLM, we undertake a thorough review of existing fine-tuning datasets and evaluation benchmarks. Furthermore, we discuss FedLLM's diverse real-world applications across multiple domains. Finally, we identify critical open challenges and outline promising research directions to foster future advancements in FedLLM. This survey aims to serve as a foundational resource for researchers and practitioners, offering valuable insights into the rapidly evolving landscape of federated fine-tuning for LLMs. It also establishes a roadmap for future innovations in privacy-preserving AI. We actively maintain a \href{https://github.com/Clin0212/Awesome-Federated-LLM-Learning}{GitHub repo} to track cutting-edge advancements in this field.