With the proliferation of large language models (LLMs), comprehensive alignment of such models across multiple tasks has emerged as a critical area of research. Existing alignment methodologies primarily address single tasks, such as multi-turn dialogue, coding, mathematical problem-solving, or tool usage. However, AI-driven products built on language models usually require a fusion of these abilities to function effectively in real-world scenarios. Moreover, the considerable computational resources required to align LLMs underscore the need for a more robust, efficient, and encompassing approach to multi-task alignment that also improves generative performance. In response to these challenges, we introduce a novel technique termed Mixture-of-Instructions (MoI), which combines instruction concatenation with diverse system prompts to boost the alignment efficiency of language models. We also compile a diverse set of seven benchmark datasets to rigorously evaluate the alignment efficacy of the MoI-enhanced language model. We apply our methodology to the open-source Qwen-7B-chat model, yielding Qwen-SFT-MoI. This enhanced model demonstrates significant advancements in generative capabilities across coding, mathematics, and tool use tasks.