Federated learning (FL) has emerged as a promising paradigm for fine-tuning foundation models using distributed data in a privacy-preserving manner. Under limited computational resources, clients often find it more practical to fine-tune a selected subset of layers, rather than the entire model, based on their task-specific data. In this study, we provide a thorough theoretical exploration of selective layer fine-tuning in FL, emphasizing a flexible approach that allows clients to adjust their selected layers according to their local data and resources. We theoretically demonstrate that the layer selection strategy has a significant impact on model convergence in two critical aspects: the importance of the selected layers and the heterogeneity of layer choices across clients. Drawing on these insights, we further propose a strategic layer selection method that utilizes local gradients and regulates layer selections across clients. Extensive experiments on both image and text datasets demonstrate the effectiveness of the proposed strategy compared with several baselines, highlighting its advantage in identifying critical layers that adapt to client heterogeneity and training dynamics in FL.
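To make the gradient-based selection concrete, below is a minimal sketch, assuming PyTorch, of how a single client might score layers by the norm of its local gradients and freeze everything else before local fine-tuning. The helper names `select_layers` and `freeze_unselected` and the `budget` parameter are illustrative assumptions, not the paper's actual method, and the cross-client regulation of layer choices described above is not shown here.

```python
import torch
import torch.nn as nn

def select_layers(model: nn.Module, loss: torch.Tensor, budget: int) -> set[str]:
    """Rank layers by the squared norm of their local gradient; keep the top `budget`."""
    loss.backward()
    scores: dict[str, float] = {}
    for name, param in model.named_parameters():
        layer = name.rsplit(".", 1)[0]  # group parameters (weight, bias) by layer
        if param.grad is not None:
            scores[layer] = scores.get(layer, 0.0) + param.grad.norm().item() ** 2
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:budget])

def freeze_unselected(model: nn.Module, selected: set[str]) -> None:
    """Only the selected layers receive updates during local fine-tuning."""
    for name, param in model.named_parameters():
        layer = name.rsplit(".", 1)[0]
        param.requires_grad_(layer in selected)

# Toy usage: a two-layer model, a random local batch, and a budget of one layer.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(model(x), y)
selected = select_layers(model, loss, budget=1)
freeze_unselected(model, selected)
```

In a full FL round, each client would run such a selection on its own data, fine-tune only the chosen layers, and send the corresponding updates to the server for aggregation.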