Low-rank adaptation (LoRA) offers an efficient alternative to full-weight adaptation in federated fine-tuning of language models, significantly reducing computational costs. By allowing each client to choose its own rank, federated LoRA enables flexible resource allocation. However, we observe that heterogeneous ranks across clients lead to unstable performance. Our analysis attributes this instability to the conventional zero-padding aggregation strategy, which dilutes information from high-rank clients during model aggregation. To address this issue, we propose a replication-based padding strategy that better retains valuable information from clients with high-quality data. Empirically, this approach accelerates convergence and enhances the global model's predictive performance.
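The sketch below illustrates the aggregation setting described above, assuming the usual LoRA factorization ΔW = B·A with per-client ranks padded to a common maximum rank before FedAvg-style averaging. The helper names (`zero_pad`, `replicate_pad`, `aggregate`) are hypothetical, and `replicate_pad` is only one plausible reading of replication-based padding (repeating existing rank-1 components with rescaling so the padded product is unchanged); the paper's actual scheme may differ.

```python
import numpy as np

def zero_pad(B, A, r_max):
    """Conventional baseline: pad LoRA factors (B: d_out x r, A: r x d_in)
    to rank r_max by appending zero columns/rows."""
    d_out, r = B.shape
    d_in = A.shape[1]
    B_pad = np.zeros((d_out, r_max)); B_pad[:, :r] = B
    A_pad = np.zeros((r_max, d_in)); A_pad[:r, :] = A
    return B_pad, A_pad

def replicate_pad(B, A, r_max):
    """Hypothetical replication-based padding (illustrative only):
    repeat existing rank-1 components, rescaling A so that the padded
    factors still satisfy B_pad @ A_pad == B @ A."""
    d_out, r = B.shape
    reps = np.resize(np.arange(r), r_max - r)          # which components get repeated
    counts = np.ones(r) + np.bincount(reps, minlength=r)  # total copies per component
    A_scaled = A / counts[:, None]                      # split each component across its copies
    B_pad = np.concatenate([B, B[:, reps]], axis=1)
    A_pad = np.concatenate([A_scaled, A_scaled[reps, :]], axis=0)
    return B_pad, A_pad

def aggregate(clients, pad_fn):
    """Pad every client's factors to the maximum rank, then average them
    with uniform FedAvg weights."""
    r_max = max(B.shape[1] for B, _ in clients)
    padded = [pad_fn(B, A, r_max) for B, A in clients]
    B_avg = np.mean([B for B, _ in padded], axis=0)
    A_avg = np.mean([A for _, A in padded], axis=0)
    return B_avg, A_avg
```

Under zero-padding, a high-rank client's extra components are averaged against zeros from low-rank clients, which is the dilution effect referred to above; replication-style padding keeps those columns populated during averaging.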