As large language models (LLMs) become increasingly prevalent in web services, effectively leveraging domain-specific knowledge while ensuring privacy has become critical. Existing methods, such as retrieval-augmented generation (RAG) and differentially private data synthesis, often compromise either the utility of domain knowledge or the privacy of sensitive data, limiting their applicability in specialized domains. To address these challenges, we propose \textit{Llamdex}, a novel framework that integrates privacy-preserving, domain-specific models into LLMs. Under the same differential privacy constraints, our approach improves accuracy on domain-specific tasks by up to 26\% over existing methods. Experimental results further show that Llamdex maintains inference efficiency comparable to that of the original LLM, highlighting its potential for real-world applications.