Personalized Dialogue Generation (PDG) aims to generate coherent responses that are consistent with given roles or personas. Traditional PDG relies on external role data, which can be scarce and raise privacy concerns. Existing approaches address these issues by extracting role information from dialogue history, but they often fail to model roles generically in a continuous space. To overcome these limitations, we introduce a novel framework that \textbf{MO}dels \textbf{R}oles from \textbf{P}ersonalized Dialogue \textbf{H}istory by \textbf{E}xploring and \textbf{U}tilizing Latent \textbf{S}pace (MORPHEUS) through a three-stage training process. Specifically, we create a persona codebook that compactly represents roles in latent space, and we use this codebook to construct a posterior distribution over role information. This method enables the model to generalize across roles, allowing it to generate personalized dialogues even for unseen roles. Experiments on both Chinese and English datasets demonstrate that MORPHEUS enhances the extraction of role information and improves response generation without external role data. Additionally, MORPHEUS can be regarded as an efficient fine-tuning method for large language models.
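To make the codebook idea concrete, the following is a minimal sketch of how a posterior over codebook entries might be computed from an encoded dialogue history. It is an illustration under stated assumptions, not the MORPHEUS implementation: the function names (`codebook_posterior`, `persona_vector`), the negative squared Euclidean distance as the similarity score, the `temperature` parameter, and the toy random codebook are all hypothetical choices made for this example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def codebook_posterior(history_vec, codebook, temperature=1.0):
    """Distribution over codebook entries given an encoded dialogue history.

    Assumption: similarity is the negative squared Euclidean distance
    between the history encoding and each code, scaled by a temperature.
    """
    scores = []
    for code in codebook:
        dist_sq = sum((h - c) ** 2 for h, c in zip(history_vec, code))
        scores.append(-dist_sq / temperature)
    return softmax(scores)


def persona_vector(posterior, codebook):
    """Posterior-weighted mixture of codes, used as the role representation."""
    dim = len(codebook[0])
    return [
        sum(p * code[i] for p, code in zip(posterior, codebook))
        for i in range(dim)
    ]


# Toy data: 8 hypothetical persona codes in a 4-dimensional latent space.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

# Toy history encoding placed very close to code 3.
history_vec = [c + 0.01 for c in codebook[3]]

posterior = codebook_posterior(history_vec, codebook, temperature=0.1)
persona = persona_vector(posterior, codebook)
```

Because the role representation is a mixture over a shared set of codes rather than a per-role lookup, a role that never appeared in training can still be mapped into the same latent space from its dialogue history alone, which is the generalization property the abstract describes.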