BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model

The rapid advancement of large language models (LLMs) has revolutionized role-playing, enabling the development of general role-playing models. However, current role-playing training has two significant issues: (I) Using a predefined role profile to prompt dialogue training for specific scenarios usually leads to inconsistencies and even conflicts between the dialogue and the profile, resulting in training biases. (II) The model learns to imitate the role based solely on the profile, neglecting profile-dialogue alignment at the sentence level. In this work, we propose a simple yet effective framework called BEYOND DIALOGUE, designed to overcome these hurdles. This framework innovatively introduces "beyond dialogue" tasks to align dialogue with profile traits based on each specific scenario, thereby eliminating biases during training. Furthermore, by adopting an innovative prompting mechanism that generates reasoning outcomes for training, the framework allows the model to achieve fine-grained alignment between profile and dialogue at the sentence level. The aforementioned methods are fully automated and low-cost. Additionally, the integration of automated dialogue and objective evaluation methods forms a comprehensive framework, paving the way for general role-playing. Experimental results demonstrate that our model excels in adhering to and reflecting various dimensions of role profiles, outperforming most proprietary general and specialized role-playing baselines. All code and datasets are available at https://github.com/yuyouyu32/BeyondDialogue.
