Federated Learning (FL) offers a promising pathway for collaboratively fine-tuning Large Language Models (LLMs) at the edge. However, this paradigm faces a critical bottleneck: the prohibitive communication and memory overheads incurred by exchanging high-dimensional gradients. Furthermore, recent studies reveal that user training data can still be recovered from these local gradients, undermining the core privacy promise of FL. In this paper, we address this trilemma of communication, memory, and privacy by proposing pAirZero, a novel framework that combines Zeroth-Order (ZO) optimization with Over-the-Air (OTA) computation. pAirZero enables resource-constrained devices to participate in federated fine-tuning of LLMs at inference-level memory cost while submitting their local gradients with only bit-level communication loads. This approach not only eliminates the high memory requirements of LLM fine-tuning but also relaxes the strict synchronization requirements that plague conventional OTA methods. We further formulate an optimization model that adaptively determines the optimal transmit power and noise levels, ensuring consistent privacy protection regardless of channel conditions. Numerical experiments demonstrate the superiority of pAirZero in enabling secure, efficient LLM fine-tuning over wireless networks, with a peak memory cost of only 25% on OPT-125M and a communication load orders of magnitude lower than that of conventional methods.
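To illustrate why combining ZO optimization with OTA computation can reduce both memory and communication, the following is a minimal sketch, not the pAirZero algorithm itself. It assumes a generic two-point (SPSA-style) ZO estimator in which all devices regenerate the same random direction from a shared seed, so that each device only needs to report one scalar, and it models OTA aggregation as a noisy sum of those scalars. The function names (`perturbation`, `zo_scalar`, `ota_average`, `federated_zo_step`), the seed-sharing scheme, the toy quadratic losses, and the Gaussian channel-noise model are all assumptions made for illustration; the power and noise optimization described in the abstract is not modeled.

```python
import numpy as np

def perturbation(seed, dim):
    """Regenerate the shared random direction z from a seed (hypothetical helper)."""
    return np.random.default_rng(seed).standard_normal(dim)

def zo_scalar(loss_fn, theta, seed, eps=1e-3):
    """Two-point ZO estimate of the directional derivative of loss_fn along z.

    Only two forward evaluations are used, so no backpropagation state is kept.
    """
    z = perturbation(seed, theta.size)
    return (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2.0 * eps)

def ota_average(scalars, noise_std, rng):
    """Toy OTA aggregation: transmitted signals superpose, receiver sees sum + noise."""
    received = np.sum(scalars) + rng.normal(0.0, noise_std)
    return received / len(scalars)

def federated_zo_step(theta, local_losses, seed, lr=0.1, noise_std=0.0, rng=None):
    """One round: each device sends a single scalar; the update reuses the shared z."""
    if rng is None:
        rng = np.random.default_rng(0)
    scalars = [zo_scalar(f, theta, seed) for f in local_losses]
    g = ota_average(scalars, noise_std, rng)
    return theta - lr * g * perturbation(seed, theta.size)

if __name__ == "__main__":
    # Two hypothetical devices with toy quadratic losses centered at different points.
    centers = [np.array([1.0, 0.0, 2.0, -1.0, 0.5]),
               np.array([3.0, 1.0, 0.0, 1.0, -0.5])]
    losses = [lambda t, c=c: 0.5 * np.sum((t - c) ** 2) for c in centers]
    theta = np.zeros(5)
    rng = np.random.default_rng(42)
    for step in range(300):
        theta = federated_zo_step(theta, losses, seed=step, noise_std=0.01, rng=rng)
    print("distance to optimum:", np.linalg.norm(theta - np.mean(centers, axis=0)))
```

In this sketch the per-round uplink payload is one scalar per device regardless of model dimension, and each device only evaluates the loss twice, which is the intuition behind the bit-level communication and inference-level memory claims. The additive noise term stands in for channel noise and any deliberately injected privacy noise.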