The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless, such domain-specific fine-tuning data often contains sensitive personally identifiable information (PII). Directly fine-tuning LLMs on this data without privacy protection poses a risk of leakage. To address this challenge, we introduce Privacy Protection Language Models (PPLM), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding data privacy. We provide a theoretical analysis to guide model design and examine several techniques, including corpus curation, a penalty-based unlikelihood training loss, and instruction-based tuning. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model's knowledge. Our work underscores the potential of LLMs to serve as robust privacy protection learners.