RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

Zekun Moore Wang,Zhongyuan Peng,Haoran Que,Jiaheng Liu,Wangchunshu Zhou,Yuhan Wu,Hongcheng Guo,Ruitong Gan,Zehao Ni,Jian Yang,Man Zhang,Zhaoxiang Zhang,Wanli Ouyang,Ke Xu,Stephen W. Huang,Jie Fu,Junran Peng

from arxiv, 30 pages, repo at https://github.com/InteractiveNLP-Team/RoleLLM-public

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4).

翻译：大型语言模型（LLM）的出现为角色扮演等复杂任务铺平了道路，该能力通过使模型能够模仿各类角色来增强用户交互体验。然而，当前最先进的大型语言模型因其闭源特性及通用训练目标，限制了其在角色扮演任务上的优化空间。本文提出RoleLLM框架，用于系统评估、激发并增强大型语言模型的角色扮演能力。该框架包含四个阶段：（1）构建涵盖100个角色的角色画像库；（2）基于上下文的指令生成（Context-Instruct）以提取角色特定知识；（3）采用GPT的角色提示技术（RoleGPT）实现语言风格模仿；（4）基于角色条件的指令微调（RoCIT）对开源模型进行细粒度角色定制化训练。通过Context-Instruct与RoleGPT，我们构建了RoleBench——首个系统化、细粒度的角色级角色扮演基准数据集，包含168,093个样本。进一步在RoleBench上应用RoCIT训练得到RoleLLaMA（英文模型）与RoleGLM（中文模型），这些模型在角色扮演能力上获得显著提升，部分场景甚至达到与RoleGPT（基于GPT-4）相当的效果。

相关内容

MoDELS

关注 0

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日