Recent approaches have attempted to personalize dialogue systems by incorporating profile information into models. However, this knowledge is scarce and difficult to obtain, which makes the ability to extract or generate profile information from dialogues a fundamental asset. To overcome this limitation, we introduce the Profile Generation Task (PGTask). We contribute a new dataset for this problem, comprising profile sentences aligned with related utterances, extracted from a corpus of dialogues. Furthermore, using state-of-the-art methods, we provide a benchmark for profile generation on this novel dataset. Our experiments reveal the challenges of profile generation, and we hope that this work opens a new research direction.