Despite the versatility of pre-trained language models (PLMs) across domains, their large memory footprints pose significant challenges in federated learning (FL), where the training model has to be distributed between a server and clients. A potential way to circumvent these constraints is to apply parameter-efficient fine-tuning (PEFT) in the context of FL. However, we observe that typical PEFT approaches suffer severely from client heterogeneity in FL scenarios, resulting in unstable and slow convergence. In this paper, we propose Client-Customized Adaptation (C2A), a novel hypernetwork-based FL framework that generates client-specific adapters by conditioning on client information. By leveraging the ability of hypernetworks to generate customized weights tailored to the distinct characteristics of their inputs, C2A maximizes the utility of shared model parameters while minimizing the divergence caused by client heterogeneity. To verify the efficacy of C2A, we perform extensive evaluations on FL scenarios involving heterogeneity in label and language distributions. Comprehensive results clearly support the superiority of C2A in terms of both efficiency and effectiveness in FL scenarios.
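To make the core mechanism concrete, the following is a minimal NumPy sketch of hypernetwork-based adapter generation: a small network maps a client embedding to the flattened weights of a bottleneck adapter, so different clients receive different adapter parameters from the same shared hypernetwork. This is an illustrative sketch under assumed names and dimensions, not the paper's actual C2A implementation.

```python
import numpy as np

def init_hypernetwork(embed_dim, hidden_dim, d_model, bottleneck, seed=0):
    # Hypernetwork parameters: a small MLP that maps a client embedding
    # to the flattened weights of one bottleneck adapter.
    rng = np.random.default_rng(seed)
    n_out = 2 * d_model * bottleneck  # down- and up-projection matrices
    return {
        "W1": rng.standard_normal((embed_dim, hidden_dim)) * 0.02,
        "W2": rng.standard_normal((hidden_dim, n_out)) * 0.02,
    }

def generate_adapter(hnet, client_embedding, d_model, bottleneck):
    # Condition on the client embedding to produce client-specific
    # adapter weights; only the hypernetwork is shared across clients.
    h = np.tanh(client_embedding @ hnet["W1"])
    flat = h @ hnet["W2"]
    W_down = flat[: d_model * bottleneck].reshape(d_model, bottleneck)
    W_up = flat[d_model * bottleneck:].reshape(bottleneck, d_model)
    return W_down, W_up

def adapter_forward(x, W_down, W_up):
    # Standard bottleneck adapter with a residual connection:
    # project down, apply a nonlinearity, project back up, add input.
    return x + np.maximum(x @ W_down, 0.0) @ W_up
```

Two clients with different embeddings thus obtain different adapters from the same shared hypernetwork, which is how a hypernetwork can absorb client heterogeneity without maintaining a separate full model per client.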