Parrot: Enhancing Multi-Turn Chat Models by Learning to Ask Questions

Impressive progress has been made on chat models based on Large Language Models (LLMs) recently; however, there is a noticeable lag in multi-turn conversations between open-source chat models (e.g., Alpaca and Vicuna) and the leading chat models (e.g., ChatGPT and GPT-4). Through a series of analyses, we attribute the lag to the lack of enough high-quality multi-turn instruction-tuning data. The available instruction-tuning data for the community are either single-turn conversations or multi-turn ones with certain issues, such as non-human-like instructions, less detailed responses, or rare topic shifts. In this paper, we address these challenges by introducing Parrot, a highly scalable solution designed to automatically generate high-quality instruction-tuning data, which are then used to enhance the effectiveness of chat models in multi-turn conversations. Specifically, we start by training the Parrot-Ask model, which is designed to emulate real users in generating instructions. We then utilize Parrot-Ask to engage in multi-turn conversations with ChatGPT across a diverse range of topics, resulting in a collection of 40K high-quality multi-turn dialogues (Parrot-40K). These data are subsequently employed to train a chat model that we have named Parrot-Chat. We demonstrate that the dialogues gathered from Parrot-Ask markedly outperform existing multi-turn instruction-following datasets in critical metrics, including topic diversity, number of turns, and resemblance to human conversation. With only 40K training examples, Parrot-Chat achieves strong performance against other 13B open-source models across a range of instruction-following benchmarks, and particularly excels in evaluations of multi-turn capabilities. We make all codes, datasets, and two versions of the Parrot-Ask model based on LLaMA2-13B and KuaiYii-13B available at https://github.com/kwai/KwaiYii/Parrot.

翻译：摘要：基于大型语言模型的对话模型近期取得了显著进展；然而，开源对话模型（如Alpaca和Vicuna）与领先对话模型（如ChatGPT和GPT-4）在多轮对话方面仍存在明显差距。通过一系列分析，我们将这一差距归因于缺乏足够高质量的多轮指令微调数据。目前社区可用的指令微调数据要么是单轮对话，要么存在某些缺陷（如指令不够拟人化、回复不够详细或话题转换频率过低）。本文通过引入Parrot（一种高度可扩展的解决方案）来应对这些挑战，该方案可自动生成高质量的指令微调数据，进而用于提升对话模型在多轮对话中的有效性。具体而言，我们首先训练Parrot-Ask模型，该模型旨在模拟真实用户生成指令。随后，我们利用Parrot-Ask模型与ChatGPT在多样化主题上进行多轮对话，最终收集到40K个高质量多轮对话（Parrot-40K）。这些数据被用于训练我们命名为Parrot-Chat的对话模型。实验表明，从Parrot-Ask采集的对话在关键指标（包括主题多样性、对话轮次数量及与人类对话的相似度）上显著优于现有的多轮指令遵循数据集。仅使用40K训练样本，Parrot-Chat在多个指令遵循基准测试中即可与13B开源模型相媲美，尤其在多轮能力评估中表现突出。我们已将全部代码、数据集及基于LLaMA2-13B和KuaiYii-13B的两个Parrot-Ask模型版本开源至https://github.com/kwai/KwaiYii/Parrot。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

14+阅读 · 2022年3月12日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日