Fine-tuning large language models on sensitive data poses significant privacy risks, as membership inference attacks can reveal whether individual records were used during training. While Differential Privacy (DP) provides formal protection, applying DP to conventional Parameter-Efficient Fine-Tuning (PEFT) methods such as Low-Rank Adaptation (LoRA) often incurs substantial utility loss. In this work, we show that a more structurally constrained PEFT architecture, Tensor Train Low-Rank Adaptation (TTLoRA), can improve the privacy-utility tradeoff by shrinking the effective parameter space while preserving expressivity. To this end, we develop TTLoRA-DP, a differentially private training framework for TTLoRA. Specifically, we extend the ghost clipping algorithm to Tensor Train cores via cached contraction states, enabling efficient Differentially Private Stochastic Gradient Descent (DP-SGD) with exact per-example gradient norm computation that never materializes full per-example gradients. Experiments on GPT-2 fine-tuning over the Enron and Penn Treebank datasets show that TTLoRA-DP consistently strengthens privacy protection relative to LoRA-DP while maintaining comparable or better downstream utility. Moreover, TTLoRA exhibits lower membership leakage even without DP training, with substantially smaller adapters that require on average 7.6× fewer parameters than LoRA. Overall, our results demonstrate that TTLoRA offers a practical path to improving the privacy-utility tradeoff in parameter-efficient language model adaptation.
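To make the ghost clipping idea concrete, the following is a minimal NumPy sketch of the standard ghost-clipping identity for a single linear layer, on which the TT-core extension described above builds. It is not the paper's TTLoRA-DP algorithm; the function names and shapes are illustrative. For a layer with per-token inputs A_i (T×d) and output gradients G_i (T×p), the per-example weight gradient is A_i^T G_i, and its squared Frobenius norm equals the inner product of the two T×T Gram matrices A_i A_i^T and G_i G_i^T, so the exact norm is obtained without ever forming the d×p gradient:

```python
import numpy as np

def per_example_grad_norms_ghost(A, G):
    """Exact per-example gradient norms via ghost clipping.

    A: (B, T, d) layer inputs; G: (B, T, p) backpropagated output grads.
    Uses ||A_i^T G_i||_F^2 = <A_i A_i^T, G_i G_i^T>_F, so only two
    (B, T, T) Gram tensors are formed, never the (B, d, p) gradients.
    """
    AA = np.einsum('btd,bsd->bts', A, A)   # input Gram matrices
    GG = np.einsum('btp,bsp->bts', G, G)   # output-grad Gram matrices
    return np.sqrt(np.einsum('bts,bts->b', AA, GG))

def per_example_grad_norms_naive(A, G):
    """Reference: materialize each per-example gradient A_i^T G_i."""
    grads = np.einsum('btd,btp->bdp', A, G)            # (B, d, p)
    return np.linalg.norm(grads.reshape(len(A), -1), axis=1)
```

Both routines agree to numerical precision; the ghost version avoids the O(B·d·p) memory of per-example gradients, which is the property the cached-contraction-state extension carries over to Tensor Train cores.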