Most open-domain dialogue systems tend to forget important information, especially in long-term conversations. Existing approaches usually train a dedicated retriever or summarizer to extract key information from past dialogue, which is time-consuming and depends heavily on the quality of labeled data. To alleviate this problem, we propose recursively generating summaries/memory with large language models (LLMs) to enhance long-term memory ability. Specifically, our method first prompts the LLM to memorize a small dialogue context and then recursively produces new memory from the previous memory and the following contexts. Finally, the LLM can generate a highly consistent response with the help of the latest memory. We evaluate our method using ChatGPT and text-davinci-003, and experiments on a widely used public dataset show that it generates more consistent responses in long-context conversations. Notably, our method is a potential solution for enabling LLMs to model extremely long contexts. Code and scripts will be released later.
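The recursive memory procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released code: `llm` is a hypothetical callable that takes a prompt string and returns a completion, the function names (`update_memory`, `build_memory`, `respond`) are invented for this sketch, and the prompt wording is a placeholder rather than the prompts used in the experiments.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def update_memory(llm: LLM, memory: str, session: List[str]) -> str:
    """Produce new memory from the previous memory and the next dialogue chunk."""
    prompt = (
        "Previous memory:\n" + (memory or "(empty)") + "\n\n"
        "New dialogue:\n" + "\n".join(session) + "\n\n"
        "Write an updated memory that keeps the important information."
    )
    return llm(prompt).strip()


def build_memory(llm: LLM, sessions: List[List[str]]) -> str:
    """Fold update_memory over the dialogue chunks, starting from empty memory."""
    memory = ""
    for session in sessions:
        memory = update_memory(llm, memory, session)
    return memory


def respond(llm: LLM, memory: str, context: List[str]) -> str:
    """Generate a reply conditioned on the latest memory and the current context."""
    prompt = (
        "Memory:\n" + memory + "\n\n"
        "Current dialogue:\n" + "\n".join(context) + "\n\n"
        "Reply to the last utterance, staying consistent with the memory."
    )
    return llm(prompt).strip()
```

In this sketch each prompt contains only the previous memory and one dialogue chunk, so its length depends on the memory and chunk sizes rather than on the full conversation history. This is the property that makes the approach a candidate for very long contexts.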