Distilling conversational skills into Small Language Models (SLMs) with approximately 1 billion parameters presents significant challenges. First, compared to larger models, SLMs have limited capacity in their parameters to store extensive knowledge. Second, high-quality conversational datasets are often scarce, small, and domain-specific. To address these challenges, we introduce a novel data distillation framework named CoDi (short for Conversational Distillation, pronounced "Cody"), which allows us to synthesize large-scale, assistant-style datasets in a steerable and diverse manner. While our framework is task-agnostic at its core, we explore and evaluate the potential of CoDi on the task of conversational grounded reasoning for question answering. This is a typical on-device scenario for specialist SLMs, enabling open-domain model responses without requiring the model to "memorize" world knowledge in its limited weights. Our evaluations show that SLMs trained with CoDi-synthesized data achieve performance on standard metrics comparable to that of models trained on human-annotated data. Additionally, when using our framework to generate larger datasets from web data, our models surpass larger, instruction-tuned models on zero-shot conversational grounded reasoning tasks.