Privacy-preserving adaptation of Large Language Models (LLMs) in sensitive domains (e.g., mental health) requires balancing strict confidentiality with model utility and safety. We propose FedMentor, a federated fine-tuning framework that integrates Low-Rank Adaptation (LoRA) and domain-aware Differential Privacy (DP) to meet per-domain privacy budgets while maintaining performance. Each client (domain) applies a custom DP noise scale proportional to its data sensitivity, and the server adaptively reduces noise when utility falls below a threshold. In experiments on three mental health datasets, we show that FedMentor improves safety over standard Federated Learning (FL) without privacy, raising safe output rates by up to three points and lowering toxicity, while maintaining utility (BERTScore F1 and ROUGE-L) within 0.5% of the non-private baseline and close to the centralized upper bound. The framework scales to backbones with up to 1.7B parameters on single-GPU clients, requiring less than 173 MB of communication per round. FedMentor demonstrates a practical approach to privately fine-tuning LLMs for safer deployment in healthcare and other sensitive fields.
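The per-domain noise mechanism described above can be sketched in a few lines. This is an illustrative reading, not the paper's implementation: each client clips its LoRA update and adds Gaussian noise whose multiplier reflects the domain's sensitivity, and the server shrinks that multiplier when utility falls below a threshold. The names (`clip_and_noise`, `adapt_noise`), the multiplicative decay rule, and the noise floor are all hypothetical choices for exposition.

```python
import numpy as np

def clip_and_noise(update, clip_norm, noise_scale, rng):
    """Clip a client's flattened LoRA update to `clip_norm` and add
    Gaussian noise (DP-SGD style). `noise_scale` is set per domain:
    more sensitive data gets a larger multiplier."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm)
    return clipped + rng.normal(0.0, noise_scale * clip_norm, size=update.shape)

def adapt_noise(noise_scale, utility, threshold, decay=0.9, floor=0.1):
    """Server-side adaptation (hypothetical rule): shrink the noise
    multiplier when utility drops below `threshold`, but never below
    `floor`, which preserves a minimum level of privacy protection."""
    if utility < threshold:
        return max(floor, noise_scale * decay)
    return noise_scale
```

In a full round, each domain would apply `clip_and_noise` with its own multiplier before upload, and the server would call `adapt_noise` per domain after evaluating utility; the floor ensures the privacy budget is never fully spent in pursuit of utility.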