Maintaining persona consistency is essential in open-domain dialogue systems, as exemplified by models such as ChatGPT. Despite significant advances, the limited scale and diversity of current persona dialogue datasets remain obstacles to building robust persona-consistent dialogue models. In this study, drawing inspiration from the success of large-scale pre-training, we introduce PPDS, an open-domain persona dialogue system that uses extensive generative pre-training on a persona dialogue dataset to enhance persona consistency. Specifically, we present a persona extraction model designed to automatically and accurately generate large-scale persona dialogue datasets. In addition, we propose a novel persona augmentation technique to address the invalid persona bias inherent in the constructed dataset. Both quantitative and human evaluations show that the proposed model achieves superior response quality and persona consistency.