Long-term memory plays a critical role in personal interaction because it allows a dialogue system to better leverage world knowledge, historical information, and preferences. We introduce PerLTQA, a QA dataset that combines semantic and episodic memories, covering world knowledge, profiles, social relationships, events, and dialogues. The dataset is collected to investigate the use of personalized memories in the QA task, with a focus on social interactions and events. PerLTQA features two types of memory and a benchmark of 8,593 questions for 30 characters, facilitating the exploration and application of personalized memories in Large Language Models (LLMs). Based on PerLTQA, we propose a novel framework for memory integration and generation that consists of three main components: Memory Classification, Memory Retrieval, and Memory Synthesis. We evaluate this framework using five LLMs and three retrievers. Experimental results demonstrate that BERT-based classification models significantly outperform LLMs such as ChatGLM3 and ChatGPT in the memory classification task. Furthermore, our study highlights the importance of effective memory integration in the QA task.
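The three components named in the abstract (Memory Classification, Memory Retrieval, Memory Synthesis) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the `Memory` record, the keyword rule standing in for a trained classifier, the token-overlap score standing in for a real retriever, and the prompt template are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Memory:
    kind: str  # "semantic" or "episodic" (hypothetical labels)
    text: str


def tokenize(text):
    # Lowercase whitespace tokens with trailing punctuation stripped.
    return [w.strip(".,?!").lower() for w in text.split()]


def classify(question):
    # Toy stand-in for memory classification: a keyword rule,
    # not a BERT-based model as evaluated in the paper.
    episodic_cues = {"when", "happened", "did", "yesterday", "last"}
    return "episodic" if episodic_cues & set(tokenize(question)) else "semantic"


def retrieve(question, memories, kind, k=1):
    # Toy stand-in for memory retrieval: rank memories of the
    # predicted kind by token overlap with the question.
    q = Counter(tokenize(question))

    def overlap(m):
        return sum((q & Counter(tokenize(m.text))).values())

    candidates = [m for m in memories if m.kind == kind]
    return sorted(candidates, key=overlap, reverse=True)[:k]


def synthesize(question, retrieved):
    # Memory synthesis here only builds a prompt; a real system
    # would pass this prompt to an LLM to generate the answer.
    context = "\n".join(f"- {m.text}" for m in retrieved)
    return f"Memories:\n{context}\nQuestion: {question}\nAnswer:"


memories = [
    Memory("semantic", "Alice works as a nurse and likes hiking."),
    Memory("episodic", "Last week Alice visited her brother Bob in Boston."),
]
question = "Where did Alice go last week?"
kind = classify(question)
prompt = synthesize(question, retrieve(question, memories, kind))
print(prompt)
```

The sketch routes the question to one memory type before retrieval, so the retriever only ranks candidates of that type. Whether the actual framework filters this strictly or uses the classification result in another way is not stated in the abstract.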