Large-scale language models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system, which unlocks infinite-length input capacity for LLMs. The SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) so that it can generate precise and coherent responses. The controller determines which memories from archived memory should be activated and how to incorporate them into the model input. The SCM system can be integrated with any LLM, enabling it to process ultra-long texts without any modification or fine-tuning. Experimental results show that the SCM system enables LLMs that are not optimized for multi-turn dialogue to achieve multi-turn dialogue capabilities comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will release a test set covering common long-text input scenarios for evaluating the abilities of LLMs in processing long documents.\footnote{Work in progress.}\footnote{\url{https://github.com/wbbeyourself/SCM4LLMs}}
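The interplay of the three modules can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (`MemoryStream`, `MemoryController`, `run_agent`), the bag-of-words cosine score used to decide which archived memories to activate, the `top_k` and `flash_size` parameters, and the `[archived]`/`[flash]`/`[input]` prompt layout are all assumptions made for illustration. The `llm` argument stands for any text-in, text-out model call.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt
from typing import Callable, List


@dataclass
class MemoryStream:
    """Stores every processed turn, oldest first (hypothetical structure)."""
    items: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.items.append(text)


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity; a stand-in for a real relevance score."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


class MemoryController:
    """Chooses which archived memories to activate and builds the model input."""

    def __init__(self, stream: MemoryStream, top_k: int = 1, flash_size: int = 1):
        self.stream = stream
        self.top_k = top_k            # assumed cap on activated archived memories
        self.flash_size = flash_size  # assumed number of most recent turns kept as flash memory

    def build_input(self, query: str) -> str:
        items = self.stream.items
        split = max(len(items) - self.flash_size, 0)
        archived, flash = items[:split], items[split:]

        # Score archived memories against the query and keep the best matches.
        scored = [(cosine(query, text), idx) for idx, text in enumerate(archived)]
        best = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        chosen = sorted(idx for _, idx in best[: self.top_k])  # back to chronological order

        lines = [f"[archived] {archived[i]}" for i in chosen]
        lines += [f"[flash] {text}" for text in flash]
        lines.append(f"[input] {query}")
        return "\n".join(lines)


def run_agent(chunks: List[str], llm: Callable[[str], str],
              controller: MemoryController) -> List[str]:
    """Processes chunks one at a time, storing each turn in the memory stream."""
    responses = []
    for chunk in chunks:
        response = llm(controller.build_input(chunk))
        controller.stream.add(f"{chunk} => {response}")
        responses.append(response)
    return responses
```

In this sketch the model never sees the full history: each call receives only the activated archived memories, the flash memory, and the current chunk, so the prompt length stays bounded regardless of how long the overall input is.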