Enhancing user engagement through personalization in conversational agents has gained significance, especially with the advent of large language models (LLMs) that generate fluent responses. Personalized dialogue generation, however, is multifaceted and varies in its definition -- ranging from instilling a persona in the agent to capturing users' explicit and implicit cues. This paper seeks to systematically survey the recent landscape of personalized dialogue generation, including the datasets employed, methodologies developed, and evaluation metrics applied. Covering 22 datasets, we highlight benchmark datasets and newer ones enriched with additional features. We further analyze 17 seminal works from top conferences between 2021 and 2023 and identify five distinct types of problems. We also shed light on recent progress made by LLMs in personalized dialogue generation. Our evaluation section offers a comprehensive summary of the assessment facets and metrics used in these works. In conclusion, we discuss prevailing challenges and envision prospective directions for future research in personalized dialogue generation.