Large Language Models (LLMs) are smart but forgetful. Recent studies of modern LLMs (e.g., Bubeck et al., 2023) have shown that they can perform tasks that typically require human-level intelligence. However, unlike humans, frozen LLMs do not improve over time; they neither acquire new knowledge nor learn from their successes or failures. Approaches to improving the intelligence of LLMs include fine-tuning models based on problem-solving performance (Zelikman et al., 2022) and building bigger and more sophisticated models (Bubeck et al., 2023). However, these methods require substantial data and computational resources to retrain existing models. In this paper, we explore the use of Retrieval Augmented Generation (RAG; Lewis et al., 2021) to improve problem-solving performance. We propose ARM-RAG (Auxiliary Rationale Memory for Retrieval Augmented Generation), a system that learns from its successes without incurring high training costs. We demonstrate that storing and subsequently retrieving reasoning chains has a positive influence on performance on grade-school math problems.
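The idea of storing successful reasoning chains and retrieving them for later problems can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the class `RationaleMemory`, the helper `build_prompt`, the exact-match check against a gold answer, and the bag-of-words cosine similarity (a toy stand-in for whatever retriever the system actually uses). The call to the frozen LLM is left out; the sketch only shows how retrieved rationales might be placed in front of a new question.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a toy stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class RationaleMemory:
    """Hypothetical auxiliary memory of reasoning chains that led to correct answers."""

    def __init__(self):
        self.entries = []  # list of (question, rationale, question_vector)

    def store(self, question, rationale, answer, gold_answer):
        """Keep the rationale only if the answer was correct (assumed success criterion)."""
        if answer != gold_answer:
            return False
        self.entries.append((question, rationale, bag_of_words(question)))
        return True

    def retrieve(self, question, k=1):
        """Return the k stored (question, rationale) pairs most similar to the query."""
        query = bag_of_words(question)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query, entry[2]), reverse=True
        )
        return [(q, r) for q, r, _ in ranked[:k]]


def build_prompt(memory, question, k=1):
    """Prepend retrieved rationales as worked examples ahead of the new question."""
    parts = [f"Q: {q}\nReasoning: {r}" for q, r in memory.retrieve(question, k)]
    parts.append(f"Q: {question}\nReasoning:")
    return "\n\n".join(parts)
```

In this sketch the model's weights are never updated. Any improvement would come only from the retrieved examples in the prompt, which is what keeps the cost low compared with fine-tuning.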