In recent years, dialogue-style Large Language Models (LLMs) such as ChatGPT and GPT4 have demonstrated immense potential in constructing open-domain dialogue agents. However, aligning these agents with specific characters or individuals remains a considerable challenge, owing to the complexity of character representation and the lack of comprehensive annotations. In this paper, we introduce the Harry Potter Dialogue (HPD) dataset, designed to advance the study of dialogue agents and character alignment. The dataset encompasses all dialogue sessions (in both English and Chinese) from the Harry Potter series and is annotated with vital background information, including dialogue scenes, speakers, character relationships, and attributes. These extensive annotations may empower LLMs to unlock character-driven dialogue capabilities. Furthermore, the dataset can serve as a universal benchmark for evaluating how well an LLM can align with a specific character. We benchmark LLMs on HPD in both fine-tuning and in-context learning settings. Evaluation results reveal that, although there is substantial room for improvement in generating high-quality, character-aligned responses, the proposed dataset is valuable in guiding models toward responses that better align with the character of Harry Potter.