The success of ChatGPT validates the potential of large language models (LLMs) in artificial general intelligence (AGI). Subsequently, the release of LLMs has sparked the open-source community's interest in instruction-tuning, which is expected to accelerate the replication of ChatGPT. However, research on instruction-tuning LLMs in Chinese, the world's most spoken language by number of native speakers, is still in its early stages. Therefore, this paper presents an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook that provides valuable findings for effectively customizing LLMs to better respond to Chinese instructions. Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types, the three most important elements of instruction-tuning. In addition, we conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment. We hope that this empirical study can make a modest contribution to an open Chinese version of ChatGPT. This paper will release a powerful Chinese LLM that is comparable to ChatGLM. The code and data are available at https://github.com/PhoebusSi/Alpaca-CoT.