Large Language Models (LLMs) are usually fine-tuned for dyadic, two-party dialogues and adapt poorly to multi-party dialogues (MPD), which hinders their use in scenarios such as multi-person meetings, discussions, and daily communication. Previous LLM-based research has focused mainly on multi-agent frameworks, while the underlying LLMs are still fine-tuned in a pairwise fashion. In this work, we design a multi-party fine-tuning framework (MuPaS) for LLMs on multi-party dialogue datasets, and show that this straightforward framework aligns the LLM with the multi-party conversation style efficiently and effectively. We also design two training strategies that convert MuPaS into an MPD simulator. Extensive experiments show that MuPaS achieves state-of-the-art multi-party responses, higher accuracy in next-speaker prediction, and higher utterance quality under both human and automatic evaluation, and can even generate reasonably for out-of-distribution scene, topic, and role descriptions. The MuPaS framework bridges LLM training and more complex multi-party applications such as conversation generation, virtual rehearsal, and the metaverse.
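To make the setup concrete, the following is a minimal sketch, not the authors' implementation, of how a multi-party dialogue might be serialized into speaker-tagged supervised fine-tuning samples in which the model learns to emit both the next speaker and that speaker's utterance. The tag syntax, field names, and `build_sample` helper are illustrative assumptions, not MuPaS's actual data format.

```python
# Hypothetical sketch: serializing a multi-party dialogue into supervised
# fine-tuning samples. Tag syntax and field layout are assumptions for
# illustration, not the paper's actual specification.

def build_sample(scene: str, roles: dict[str, str], turns: list[tuple[str, str]]):
    """Turn a multi-party dialogue into (prompt, target) pairs where the
    model must predict the next speaker tag plus that speaker's utterance."""
    header = f"Scene: {scene}\n" + "\n".join(
        f"[{name}] {desc}" for name, desc in roles.items()
    )
    samples = []
    for i in range(1, len(turns)):
        history = "\n".join(f"<{spk}> {utt}" for spk, utt in turns[:i])
        next_spk, next_utt = turns[i]
        samples.append({
            "prompt": f"{header}\n{history}\n",
            "target": f"<{next_spk}> {next_utt}",  # next-speaker tag + utterance
        })
    return samples

# Usage: a three-party exchange; during fine-tuning the loss would
# typically be applied only on the target span.
samples = build_sample(
    scene="Project stand-up meeting",
    roles={"Ana": "team lead", "Bo": "engineer", "Cy": "designer"},
    turns=[("Ana", "Status update, please."),
           ("Bo", "The API refactor is done."),
           ("Cy", "Mock-ups ship tomorrow.")],
)
```

Training on samples of this shape would teach the model both who speaks next and what they say, which is the core of the multi-party alignment the abstract describes.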