Fine-tuning large vision models (LVMs) and large language models (LLMs) under differentially private federated learning (DPFL) is hindered by a fundamental privacy-utility trade-off. Low-Rank Adaptation (LoRA), a promising parameter-efficient fine-tuning (PEFT) method, reduces computational and communication costs by introducing two trainable low-rank matrices while freezing the pre-trained weights. However, directly applying LoRA in DPFL settings leads to performance degradation, especially for LVMs. Our analysis reveals three previously underexplored challenges: (1) gradient coupling caused by the simultaneous update of the two asymmetric low-rank matrices, (2) compounded noise amplification under differential privacy, and (3) sharpness of the globally aggregated model in the parameter space. To address these issues, we propose LA-LoRA (\textbf{L}ocal \textbf{A}lternating \textbf{LoRA}), a novel approach that decouples gradient interactions and aligns update directions across clients to enhance robustness under stringent privacy constraints. Theoretically, LA-LoRA strengthens convergence guarantees in noisy federated environments. Extensive experiments demonstrate that LA-LoRA achieves state-of-the-art (SOTA) performance on Swin Transformer and RoBERTa models, showcasing robustness to DP noise and broad applicability across both LVMs and LLMs. For example, when fine-tuning the Swin-B model on the Tiny-ImageNet dataset under a strict privacy budget ($\varepsilon = 1$), LA-LoRA outperforms the best baseline, RoLoRA, by 16.83\% in test accuracy. Code is provided at \repolink.
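To make the gradient-coupling issue concrete, the sketch below shows a LoRA linear layer with an alternating freeze schedule in PyTorch: exactly one of the two low-rank factors receives a gradient on each local step, so the two updates are never computed through each other simultaneously. The class name \texttt{AlternatingLoRALinear}, the even/odd step schedule, and all hyperparameters are illustrative assumptions; the abstract only states that LA-LoRA decouples the factor updates, and this sketch omits DP clipping/noising and the cross-client direction alignment entirely.

\begin{verbatim}
# A minimal sketch, assuming a linear layer y = x (W + BA)^T with the
# pre-trained weight W frozen and only the low-rank factors A, B trainable.
# The alternating schedule (A on even local steps, B on odd steps) is an
# illustrative reading of "Local Alternating"; DP clipping/noising and the
# cross-client alignment from the paper are NOT reproduced here.
import torch
import torch.nn as nn

class AlternatingLoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        # Frozen pre-trained weight W (random here as a stand-in).
        self.weight = nn.Parameter(torch.randn(d_out, d_in),
                                   requires_grad=False)
        # LoRA factors: A small-random, B zero, so BA = 0 at initialization.
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.weight + self.B @ self.A).T

    def set_local_step(self, step: int) -> None:
        # Train exactly one factor per local step, decoupling the
        # gradients of A and B from each other.
        self.A.requires_grad_(step % 2 == 0)
        self.B.requires_grad_(step % 2 == 1)

# Toy local loop: each step updates only the currently active factor
# (PyTorch optimizers skip parameters whose gradient is None).
layer = AlternatingLoRALinear(16, 16)
opt = torch.optim.SGD([layer.A, layer.B], lr=0.1)
x, y = torch.randn(4, 16), torch.randn(4, 16)
for step in range(4):
    layer.set_local_step(step)
    opt.zero_grad()
    loss = ((layer(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
\end{verbatim}

Zero-initializing $B$ keeps the adapted model identical to the pre-trained one before training begins, a standard LoRA convention.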