Large Language Models (LLMs) are powerful tools for natural language processing, enabling novel applications and user experiences. However, to achieve optimal performance, LLMs often require adaptation with private data, which poses privacy and security challenges. Several techniques have been proposed to adapt LLMs with private data, such as Low-Rank Adaptation (LoRA), Soft Prompt Tuning (SPT), and In-Context Learning (ICL), but their comparative privacy and security properties have not been systematically investigated. In this work, we fill this gap by evaluating the robustness of LoRA, SPT, and ICL against three types of well-established attacks: membership inference, which exposes data leakage (privacy); backdoor, which injects malicious behavior (security); and model stealing, which can violate intellectual property (privacy and security). Our results show that there is no silver bullet for privacy and security in LLM adaptation and each technique has different strengths and weaknesses.