Adapting Foundation Models (FMs) to downstream tasks through Federated Learning (FL) has emerged as a promising strategy for protecting both data privacy and valuable FMs. Existing methods fine-tune the FM by allocating a sub-FM to clients in FL; however, this leads to suboptimal performance because of insufficient tuning and the inevitable accumulation of gradient errors. In this paper, we propose Federated Proxy Fine-Tuning (FedPFT), a novel method that enhances FM adaptation to downstream tasks through FL by means of two key modules. First, the sub-FM construction module employs a layer-wise compression approach that emphasizes the crucial neurons in each layer, facilitating comprehensive fine-tuning of the FM across all layers. Second, the sub-FM alignment module conducts a two-step distillation, layer-level before FL fine-tuning and neuron-level during it, to reduce gradient error by accurately aligning the sub-FM with the FM under theoretical guarantees. Experimental results on seven commonly used datasets (four text and three vision) demonstrate the superiority of FedPFT.
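The two modules described in the abstract can be illustrated with a toy sketch: a feed-forward layer is compressed by keeping only its highest-scoring hidden neurons, and a layer-level alignment loss measures how far the compressed layer's output drifts from the full layer's. This is a minimal illustration under stated assumptions, not FedPFT's actual procedure: the importance score (product of incoming and outgoing weight norms), the mean-squared-error loss, and all function names (`compress_ffn`, `layer_alignment_loss`, `ffn`) are hypothetical choices made for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def ffn(x, w_in, w_out):
    """Feed-forward block: ReLU(x @ w_in) @ w_out."""
    hidden = [[max(v, 0.0) for v in row] for row in matmul(x, w_in)]
    return matmul(hidden, w_out)


def compress_ffn(w_in, w_out, keep):
    """Build a smaller layer from the `keep` highest-scoring hidden neurons.

    The score used here is an assumption for illustration only; the
    abstract does not specify how crucial neurons are identified.
    """
    def score(j):
        in_norm = math.sqrt(sum(row[j] ** 2 for row in w_in))
        out_norm = math.sqrt(sum(v ** 2 for v in w_out[j]))
        return in_norm * out_norm

    kept = sorted(sorted(range(len(w_out)), key=score)[-keep:])
    sub_in = [[row[j] for j in kept] for row in w_in]
    sub_out = [w_out[j] for j in kept]
    return sub_in, sub_out, kept


def layer_alignment_loss(x, full, sub):
    """Mean squared error between the full layer's and sub-layer's outputs."""
    y_full, y_sub = ffn(x, *full), ffn(x, *sub)
    diffs = [(a - b) ** 2 for rf, rs in zip(y_full, y_sub) for a, b in zip(rf, rs)]
    return sum(diffs) / len(diffs)


random.seed(0)
d_model, d_hidden, batch = 8, 32, 16
w_in = [[random.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_model)]
w_out = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(d_hidden)]
x = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(batch)]

# Compress the layer to a quarter of its hidden width and measure the gap.
sub_in, sub_out, kept = compress_ffn(w_in, w_out, keep=8)
loss_sub = layer_alignment_loss(x, (w_in, w_out), (sub_in, sub_out))
print(f"kept neurons: {kept}")
print(f"layer-level alignment loss: {loss_sub:.4f}")
```

In a distillation setting, a loss of this kind would be minimized with respect to the sub-layer's weights so that the compressed model tracks the full model more closely; the sketch only evaluates the loss and does not perform that optimization.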