This paper surveys research in the rapidly advancing field of instruction tuning (IT), a crucial technique for enhancing the capabilities and controllability of large language models (LLMs). Instruction tuning refers to further training LLMs, in a supervised fashion, on a dataset of \textsc{(instruction, output)} pairs, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions. In this work, we systematically review the literature, covering the general methodology of IT, the construction of IT datasets, the training of IT models, and its use across different modalities, domains, and applications, along with an analysis of factors that influence the outcome of IT (e.g., how instruction outputs are generated and the size of the instruction dataset). We also review the potential pitfalls of IT and the criticism against it, discuss efforts that point out deficiencies of existing strategies, and suggest some avenues for fruitful research. Project page: github.com/xiaoya-li/Instruction-Tuning-Survey
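To make the supervised objective concrete, one common way to write it is sketched below. The notation is introduced here for illustration and is not taken from the survey: $x$ denotes an instruction, $y = (y_1, \dots, y_T)$ the paired output, $\mathcal{D}$ the instruction dataset, and $\theta$ the model parameters. The sketch assumes the loss is computed only over output tokens; some implementations also include the instruction tokens.
\[
\mathcal{L}(\theta) = - \sum_{(x, y) \in \mathcal{D}} \sum_{t=1}^{T} \log p_{\theta}\left(y_t \mid x, y_{<t}\right)
\]
Under this assumption, the model keeps the same next-word prediction form used in pre-training, but every prediction is conditioned on the instruction $x$, which is what ties the training signal to instruction following.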