Previous studies have shown that demonstrations can significantly help Large Language Models (LLMs) perform better on given tasks. However, this so-called In-Context Learning (ICL) ability is very sensitive to the presented context, and dozens of demonstrations are often needed. In this work, we investigate whether the number of shots can be reduced while still maintaining competitive performance. We present SeCoKD, a self-Knowledge Distillation (KD) training framework that aligns the student model with a heavily prompted variation of itself, thereby increasing the utilization of a single demonstration. We evaluate SeCoKD across three LLMs and six benchmarks, focusing mainly on reasoning tasks. Results show that our method outperforms the base model and Supervised Fine-tuning (SFT), especially in zero-shot and one-shot settings, by 30% and 10%, respectively. Moreover, SeCoKD introduces few negative artifacts when evaluated on new tasks, making it more robust than Supervised Fine-tuning.
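The self-distillation idea can be illustrated with a minimal sketch. The abstract does not specify the training objective, so everything here is an assumption made for illustration: the teacher is taken to be the model prompted with several demonstrations, the student is the same model prompted with one, and the loss is a generic token-level KL divergence. The helper names (`build_prompt`, `distillation_loss`) and the toy logits are hypothetical and not taken from the paper.

```python
import math

def build_prompt(demos, query):
    """Concatenate (question, answer) demonstrations ahead of the query."""
    shots = [f"Q: {q}\nA: {a}" for q, a in demos]
    return "\n\n".join(shots + [f"Q: {query}\nA:"])

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student); a generic KD objective (assumed)."""
    per_token = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(per_token) / len(per_token)

# Hypothetical demonstrations: the teacher sees all of them, the student only one.
demos = [("2 + 2?", "4"), ("3 + 5?", "8"), ("7 - 4?", "3")]
teacher_prompt = build_prompt(demos, "6 + 1?")
student_prompt = build_prompt(demos[:1], "6 + 1?")

# Toy next-token logits (two positions, vocabulary of three) standing in for
# the same model's outputs under the two prompts.
teacher_logits = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student_logits = [[1.0, 0.8, -0.5], [0.2, 0.9, 0.4]]
loss = distillation_loss(teacher_logits, student_logits)
```

In a real setup the logits would come from forward passes of the same LLM under the two prompts, and the loss would be minimized with respect to the student's parameters only; the sketch shows just the shape of the objective.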