While instruction fine-tuning of large language models (LLMs) has been shown to improve performance across a range of applications, the influence of the instruction dataset mixture on LLMs has not been thoroughly explored. In this study, we classify instructions into three main types (NLP downstream tasks, coding, and general chat) and investigate their impact on LLMs. Our findings reveal that specific types of instructions are more beneficial for particular uses but may harm other aspects, which underscores the importance of carefully designing the instruction mixture to maximize model performance. This study sheds light on the instruction mixture and paves the way for future research.