We propose MemoChat, an instruction-refinement pipeline that enables large language models (LLMs) to use self-composed memos to stay consistent in long-range open-domain conversations. MemoChat conducts such conversations through iterative "memorization-retrieval-response" cycles, which requires tuning instructions carefully tailored to each stage. The instructions are reconstructed from a collection of public datasets and teach the LLMs to memorize and retrieve past dialogues with structured memos, leading to enhanced consistency in later turns of the conversation. We invite experts to manually annotate a test set designed to evaluate consistency on long-range conversation questions. Experiments on three testing scenarios involving both open-source and API-accessible chatbots at scale verify the efficacy of MemoChat, which outperforms strong baselines.
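To make the "memorization-retrieval-response" cycle concrete, the following minimal Python sketch illustrates one possible shape of the three stages. It is not the MemoChat implementation: the names `Memo`, `memorize`, `retrieve`, `respond`, `toy_summarize`, and `toy_generate` are hypothetical, and the memo fields (topic, summary, turns) are assumed here purely for illustration. In MemoChat the LLM itself performs each stage under tuned instructions, whereas this sketch substitutes a simple word-overlap ranking for retrieval and toy callables for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memo:
    """One structured memo entry (fields are assumptions for this sketch)."""
    topic: str
    summary: str
    turns: List[str] = field(default_factory=list)


def memorize(recent_turns: List[str],
             summarize: Callable[[List[str]], Memo]) -> Memo:
    # Stage 1 (memorization): condense recent turns into a structured memo.
    # `summarize` stands in for an instruction-tuned LLM call.
    return summarize(recent_turns)


def retrieve(query: str, memos: List[Memo], top_k: int = 1) -> List[Memo]:
    # Stage 2 (retrieval): pick memos relevant to the query.
    # Word overlap is a stand-in; it is NOT the retrieval method of the paper.
    query_words = set(query.lower().split())
    scored = []
    for memo in memos:
        memo_words = set(f"{memo.topic} {memo.summary}".lower().split())
        score = len(query_words & memo_words)
        if score > 0:
            scored.append((score, memo))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [memo for _, memo in scored[:top_k]]


def respond(query: str, evidence: List[Memo],
            generate: Callable[[str, str], str]) -> str:
    # Stage 3 (response): answer the query conditioned on retrieved memos.
    context = "\n".join(f"[{m.topic}] {m.summary}" for m in evidence)
    return generate(context, query)


def toy_summarize(turns: List[str]) -> Memo:
    # Toy summarizer: first three words as topic, all turns joined as summary.
    topic = " ".join(turns[0].split()[:3])
    return Memo(topic=topic, summary=" ".join(turns), turns=list(turns))


def toy_generate(context: str, query: str) -> str:
    # Toy generator: echoes the evidence it was given.
    return f"Based on memo: {context or '(none)'} | Answering: {query}"


if __name__ == "__main__":
    memos = [
        memorize(["I adopted a cat named Miso"], toy_summarize),
        memorize(["My flight to Oslo leaves Friday"], toy_summarize),
    ]
    question = "what is the name of the cat"
    print(respond(question, retrieve(question, memos), toy_generate))
```

Passing the model calls in as callables keeps the control flow of the cycle separate from any particular LLM backend, so the toy functions could be replaced by real model calls without changing the loop.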