Large language models (LLMs) possess broad knowledge, but their task-specific performance is often suboptimal. Improving it requires fine-tuning on task-specific data, which may be inaccessible due to privacy concerns. In this paper, we propose a novel approach that enhances LLMs with smaller language models (SLMs) trained on clients using their private task-specific data. To enable mutual enhancement between the LLM and the SLMs, we propose CrossLM, in which the SLMs guide the LLM to generate high-quality task-specific data, and both the LLM and the SLMs are then enhanced with the generated data. We evaluate CrossLM using publicly accessible language models across a range of benchmark tasks. The results demonstrate that CrossLM significantly enhances the task-specific performance of both the SLMs on clients and the LLM on the cloud server while preserving the LLM's generalization capability.
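The data flow described above can be illustrated with a toy sketch. It is not the CrossLM algorithm itself: the abstract does not say how SLM feedback is computed or applied, so this example simply assumes that each client SLM scores LLM-generated samples and that the server keeps the samples whose average score clears a threshold. All names here (`keyword_slm`, `select_synthetic_data`, the threshold value, the sample texts) are hypothetical, and the "SLMs" are keyword rules standing in for models trained on private data.

```python
from statistics import mean
from typing import Callable, List, Set, Tuple

Sample = Tuple[str, str]               # (generated text, intended label)
Scorer = Callable[[str, str], float]   # client SLM feedback in [0, 1]


def keyword_slm(positive_words: Set[str]) -> Scorer:
    """Hypothetical stand-in for a client SLM trained on private data."""
    def score(text: str, label: str) -> float:
        hits = sum(word in positive_words for word in text.lower().split())
        predicted = "positive" if hits > 0 else "negative"
        # Feedback is 1.0 when the SLM agrees with the intended label.
        return 1.0 if predicted == label else 0.0
    return score


def select_synthetic_data(
    candidates: List[Sample], slms: List[Scorer], threshold: float = 0.5
) -> List[Sample]:
    """Keep generated samples whose mean SLM feedback reaches the threshold.

    Assumption: only feedback scores leave the clients, never private data.
    """
    kept = []
    for text, label in candidates:
        feedback = mean(slm(text, label) for slm in slms)
        if feedback >= threshold:
            kept.append((text, label))
    return kept


# Candidates that an LLM might generate for a sentiment task (made up here).
candidates = [
    ("a great and moving film", "positive"),
    ("dull plot and flat acting", "negative"),
    ("a great waste of time", "negative"),
]
slms = [keyword_slm({"great", "moving"}), keyword_slm({"great", "good"})]
synthetic = select_synthetic_data(candidates, slms)
```

In this sketch, `synthetic` would then serve as the shared training set for both the LLM and the SLMs. The third candidate is dropped because both toy SLMs disagree with its label.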