Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

Considerable effort has been invested in improving the role-playing proficiency of open-source large language models (LLMs) by imitating proprietary counterparts. We posit, however, that LLMs inherently possess role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora. In this study we therefore introduce Ditto, a self-alignment method for role-play. Ditto capitalizes on character knowledge, encouraging an instruction-following LLM to simulate role-play dialogues as a variant of reading comprehension. This method creates a role-play training set comprising 4,000 characters, ten times as many roles as currently available datasets. We then fine-tune the LLM on this self-generated dataset to strengthen its role-playing capabilities. Evaluated on our carefully constructed, reproducible role-play benchmark and on the role-play subset of MT-Bench, Ditto, at various parameter scales, consistently maintains a stable role identity and provides accurate role-specific knowledge in multi-turn role-play conversations. Notably, it outperforms all open-source role-play baselines and performs comparably to advanced proprietary chatbots. Furthermore, we present the first comprehensive cross-supervision alignment experiment in the role-play domain, which reveals that the knowledge available in role-play is bounded by the intrinsic capabilities of the LLM, whereas role-play styles can be easily acquired under the guidance of smaller models. We open-source related resources at https://github.com/OFA-Sys/Ditto.
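To make the "role-play as reading comprehension" idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each character comes with a short knowledge passage, and it frames each user query as a question to be answered from that passage in the character's voice. The names `CharacterProfile`, `build_simulation_prompt`, and `simulate_dialogue`, the prompt wording, and the record layout are all hypothetical; `generate` stands in for whatever instruction-following LLM call is available.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CharacterProfile:
    # Hypothetical container: the abstract only says "character knowledge".
    name: str
    knowledge: str  # short passage of facts about the character


def build_simulation_prompt(profile: CharacterProfile, user_query: str) -> str:
    """Frame role-play as reading comprehension: passage first, then the question."""
    return (
        f"Read the following passage about {profile.name}.\n\n"
        f"Passage:\n{profile.knowledge}\n\n"
        f"Using only what the passage supports, answer the user as "
        f"{profile.name}, in the first person.\n\n"
        f"User: {user_query}\n"
        f"{profile.name}:"
    )


def simulate_dialogue(
    profile: CharacterProfile,
    queries: List[str],
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Collect (query, response) records that could later serve as fine-tuning data.

    `generate` is an assumed callable mapping a prompt string to model text.
    """
    records = []
    for query in queries:
        reply = generate(build_simulation_prompt(profile, query))
        records.append(
            {"character": profile.name, "query": query, "response": reply.strip()}
        )
    return records


if __name__ == "__main__":
    # Stub model so the sketch runs without any LLM dependency.
    def fake_llm(prompt: str) -> str:
        return " I live at 221B Baker Street. "

    holmes = CharacterProfile(
        name="Sherlock Holmes",
        knowledge="A consulting detective who lives at 221B Baker Street.",
    )
    for record in simulate_dialogue(holmes, ["Where do you live?"], fake_llm):
        print(record)
```

In this sketch the self-generated records are simply collected in a list; how they are filtered, formatted, and used for fine-tuning is left open, since the abstract does not specify it.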

