Recent language models have achieved impressive performance on natural language tasks by pairing instructions with the task input during fine-tuning. Because every sample in a given task can be described by the same task instructions, many instruction datasets provide only a few instructions per task and do not account for the input of each individual example. This approach becomes ineffective in complex multi-turn dialogue generation, where the input changes substantially at every turn as the dialogue context evolves, so simple task-level instructions cannot improve generation performance. To address this limitation, we introduce a context-based instruction fine-tuning framework for multi-turn dialogue, in which the model generates both instructions and responses from the previous context. During evaluation, the model generates an instruction from the previous context and uses it to self-guide its response. In quantitative evaluations on dialogue benchmark datasets, the proposed framework achieves results comparable or even superior to the baselines with a reduced computation budget, by aligning the instructions with the input during fine-tuning.
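The two-stage procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt markers `[instruction]` and `[response]`, the helper names, and the `generate` callable (standing in for any fine-tuned language model) are all hypothetical, and the abstract does not specify the actual prompt format or how context-specific instructions are obtained for training.

```python
from typing import Callable, List, Tuple

# Hypothetical prompt markers; the actual format is not given in the abstract.
INSTRUCTION_TAG = "[instruction]"
RESPONSE_TAG = "[response]"


def format_context(turns: List[str]) -> str:
    """Join the previous dialogue turns into one context string."""
    return "\n".join(turns)


def build_training_pairs(
    turns: List[str], instruction: str, response: str
) -> List[Tuple[str, str]]:
    """Build (prompt, target) pairs for one dialogue turn.

    Assumption: the model is trained on two targets per turn,
    the context-specific instruction and the response guided by it.
    """
    context = format_context(turns)
    instruction_prompt = f"{context}\n{INSTRUCTION_TAG}"
    response_prompt = f"{context}\n{INSTRUCTION_TAG} {instruction}\n{RESPONSE_TAG}"
    return [(instruction_prompt, instruction), (response_prompt, response)]


def self_guided_response(turns: List[str], generate: Callable[[str], str]) -> str:
    """Generate an instruction from the context, then a response guided by it."""
    context = format_context(turns)
    # Stage 1: the model writes its own instruction for this context.
    instruction = generate(f"{context}\n{INSTRUCTION_TAG}")
    # Stage 2: the generated instruction is fed back to guide the response.
    return generate(f"{context}\n{INSTRUCTION_TAG} {instruction}\n{RESPONSE_TAG}")
```

In this sketch, the same prompt layout is used at training and inference time, so the instruction the model generates for itself occupies the position that a dataset-provided instruction held during fine-tuning.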