Towards Building the Federated GPT: Federated Instruction Tuning

While instruction-tuned generative large language models (LLMs) such as ChatGPT and GPT-4 have demonstrated an impressive ability to generalize to new tasks, their training relies heavily on large amounts of diverse, high-quality instruction data. Acquiring such data, especially human-written data, is challenging in terms of both cost and accessibility, and privacy concerns can further limit access to it. This scarcity hinders the generality of the tuned models and may restrict their effectiveness in certain contexts. To tackle this issue, we introduce Federated Instruction Tuning (FedIT), which uses federated learning (FL) as the learning framework for the instruction tuning of LLMs. This marks the first exploration of FL-based instruction tuning for LLMs. The setting matters because text data is predominantly generated by end users, so FL approaches must be designed and adapted to leverage the diverse instructions stored on users' local devices while preserving privacy and ensuring data security. Using the widely adopted GPT-4 auto-evaluation, we demonstrate that exploiting the heterogeneous and diverse instruction sets held by clients with the proposed FedIT framework improves LLM performance compared to centralized training with only limited local instructions. We also developed a GitHub repository named Shepherd, which offers a foundational framework for exploring federated fine-tuning of LLMs using heterogeneous instructions across diverse categories.
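
To make the FL setting concrete, the following is a minimal sketch of one communication round, assuming a FedAvg-style weighted average as the aggregation rule (a common FL baseline; the abstract itself does not specify the aggregation method). The names `local_update`, `fedavg`, and `run_round` are hypothetical and are not taken from Shepherd, and the local step is a toy stand-in for instruction tuning on a client's private data.

```python
from typing import Dict, List, Tuple

# A model is represented as named lists of weights (toy stand-in for LLM parameters).
Params = Dict[str, List[float]]


def local_update(global_params: Params, client_target: float, lr: float = 0.5) -> Params:
    """Toy stand-in for local instruction tuning (assumption, not the paper's method).

    Each weight moves a fraction `lr` of the way toward a client-specific target,
    mimicking how private local instructions pull the model in a client-specific
    direction. The raw data never leaves the client; only weights are returned.
    """
    return {
        name: [w + lr * (client_target - w) for w in values]
        for name, values in global_params.items()
    }


def fedavg(updates: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighted by each client's number of local examples."""
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("clients must hold at least one example in total")
    reference = updates[0][0]
    aggregated: Params = {}
    for name, values in reference.items():
        aggregated[name] = [
            sum(params[name][i] * n for params, n in updates) / total
            for i in range(len(values))
        ]
    return aggregated


def run_round(global_params: Params, clients: List[Tuple[float, int]]) -> Params:
    """One round: every client tunes locally, then the server aggregates.

    `clients` holds (client_target, num_local_examples) pairs.
    """
    updates = [(local_update(global_params, target), n) for target, n in clients]
    return fedavg(updates)


if __name__ == "__main__":
    start = {"w": [0.0]}
    # Two clients with different (heterogeneous) data and different dataset sizes.
    print(run_round(start, [(2.0, 1), (4.0, 3)]))
```

The server only ever sees parameter updates, which is what lets the heterogeneous instructions stay on local devices. Weighting by example count is one design choice among several; a real system would also need to address client sampling and communication cost, which this sketch omits.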
