TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models

To embed domain-specific or specialized knowledge into pre-trained foundation models, fine-tuning is common practice, often with parameter-efficient techniques such as LoRA. However, as new LLM architectures and pre-trained models emerge, transferring this specialized knowledge to newer models becomes an important task. In many scenarios, the original specialized data is unavailable due to privacy or commercial restrictions, so the knowledge must be distilled from the fine-tuned model and transferred to a different pre-trained model. We present TuneShift-KD, a novel approach that automatically distills specialized knowledge from a fine-tuned model to a target model using only a few examples representative of the specialized information. Our key insight is that specialized knowledge can be identified through perplexity differences between the base and fine-tuned models: prompts where the fine-tuned model responds confidently (low perplexity) but the base model struggles (high perplexity) indicate queries that correspond to the specialized knowledge learned by the fine-tuned model. TuneShift-KD leverages this insight to create a synthetic training dataset for transferring the specialized knowledge. Using an iterative process, it generates more prompts similar to those that elicited responses containing specialized knowledge. TuneShift-KD does not require training discriminators or access to the training datasets. It is an automated approach that requires only the initial fine-tuned and base models and a few representative prompts. Our experiments demonstrate that models fine-tuned using TuneShift-KD achieve higher accuracy than those produced by prior approaches, enabling easier deployment and more effective transfer of the specialized knowledge.
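The perplexity-difference criterion can be illustrated with a minimal sketch. This is not the TuneShift-KD implementation. It assumes that per-token natural-log probabilities of a response have already been obtained from both the base and the fine-tuned model, and that both models score the same response tokens. The function names, the record layout, and both thresholds (`max_finetuned_ppl`, `min_ratio`) are hypothetical choices made for illustration only.

```python
import math
from typing import Dict, List, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean token log-probability (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def select_specialized(
    records: List[Dict],
    max_finetuned_ppl: float = 2.0,  # hypothetical cutoff for "confident"
    min_ratio: float = 2.0,          # hypothetical base/fine-tuned perplexity ratio
) -> List[str]:
    """Keep prompts where the fine-tuned model is confident and the base model is not.

    Each record is assumed to hold a prompt plus the per-token log-probabilities
    that each model assigns to the same response.
    """
    selected = []
    for rec in records:
        ft_ppl = perplexity(rec["finetuned_logprobs"])
        base_ppl = perplexity(rec["base_logprobs"])
        # Low fine-tuned perplexity together with a much higher base perplexity
        # is treated here as a signal of specialized knowledge.
        if ft_ppl <= max_finetuned_ppl and base_ppl / ft_ppl >= min_ratio:
            selected.append(rec["prompt"])
    return selected
```

In an iterative setup such as the one described above, prompts that pass this kind of filter would serve as seeds for generating further similar prompts. How the thresholds are chosen in practice is not specified here.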
