Low-rank adaptation of language models has been proposed to reduce the computational and memory overhead of fine-tuning pre-trained language models. LoRA injects trainable low-rank matrices, called adapters, into selected weight matrices of the pre-trained model. In this work, we show theoretically that the low-rank adaptation mechanism of LoRA is equivalent to fine-tuning adapters with noisy batch gradients, where the noise variance is a decreasing function of the adaptation rank ($r$). Motivated by this understanding, we prove inherent differential privacy for LoRA when the adaptation matrices $A_\ell$ are frozen. We show that various factors, e.g., the adaptation rank and batch size, affect the guaranteed privacy level. Our findings provide useful insights into LoRA and uncover the reason behind the robustness of models fine-tuned with LoRA to privacy attacks.
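The adapter mechanism described above can be sketched as follows. This is a minimal illustration, not the paper's experimental setup: the dimensions, initialization scales, and layer shape are assumed for the example. The key structural points match the abstract: the pre-trained weight $W_0$ and the down-projection $A_\ell$ are frozen, and only the rank-$r$ up-projection $B$ is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration; r is the adaptation rank.
d_out, d_in, r = 8, 8, 2

W0 = rng.standard_normal((d_out, d_in))             # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) / np.sqrt(d_in)  # frozen down-projection (A_l)
B = np.zeros((d_out, r))                            # trainable up-projection, zero-init

def lora_forward(x):
    # Adapted layer: W0 x + B (A x). Only B receives gradient updates,
    # so fine-tuning optimizes r * d_out parameters instead of d_out * d_in.
    return W0 @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted model initially matches the base model.
assert np.allclose(lora_forward(x), W0 @ x)
```

Because gradients with respect to $B$ are pre-multiplied by the frozen random matrix $A_\ell$, the effective update seen by the adapter behaves like a noisy projection of the full batch gradient, which is the equivalence the analysis builds on.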