Large language models are strong sequence predictors, yet standard inference relies on immutable context histories. After making an error at generation step t, the model lacks an updatable memory mechanism that improves predictions for step t+1. We propose LLM-as-RNN, an inference-only framework that turns a frozen LLM into a recurrent predictor by representing its hidden state as natural-language memory. This state, implemented as a structured system-prompt summary, is updated at each timestep via feedback-driven text rewrites, enabling learning without parameter updates. Under a fixed token budget, LLM-as-RNN corrects errors and retains task-relevant patterns, effectively performing online learning through language. We evaluate the method on three sequential benchmarks in healthcare, meteorology, and finance across Llama, Gemma, and GPT model families. LLM-as-RNN significantly outperforms zero-shot, full-history, and MemPrompt baselines, improving predictive accuracy by 6.5% on average, while producing interpretable, human-readable learning traces absent in standard context accumulation.
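The loop the abstract describes can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: `llm` is stubbed so the example is self-contained (a real system would call a frozen chat model such as Llama, Gemma, or GPT), and the function names, prompts, and token budget are illustrative assumptions.

```python
# Minimal sketch of the LLM-as-RNN inference loop: the "hidden state" is a
# natural-language memory carried in the system prompt and rewritten at each
# timestep from prediction feedback. All names here are illustrative.

def llm(system_prompt: str, user_prompt: str) -> str:
    """Stub for a frozen LLM call; a real version would query a chat model."""
    return f"[model output for: {user_prompt[:30]}]"

def predict(state: str, x_t: str) -> str:
    """Predict y_t from the current natural-language state and input x_t."""
    return llm(system_prompt=state,
               user_prompt=f"Input: {x_t}\nPredict the next value.")

def update_state(state: str, x_t: str, y_pred: str, y_true: str,
                 budget: int = 1500) -> str:
    """Rewrite the state from feedback; the rewrite is itself an LLM call,
    constrained to a fixed token budget (assumed value here)."""
    feedback = (f"Input: {x_t}\nPredicted: {y_pred}\nActual: {y_true}\n"
                f"Revise the memory: correct errors, keep task-relevant "
                f"patterns, stay under {budget} tokens.")
    return llm(system_prompt=state, user_prompt=feedback)

def run(stream, initial_state="You are a sequential predictor. Memory: (empty)"):
    """Recurrent loop over (x_t, y_true) pairs: no parameter updates, only
    text rewrites of the state between timesteps."""
    state, preds = initial_state, []
    for x_t, y_true in stream:
        y_pred = predict(state, x_t)
        preds.append(y_pred)
        state = update_state(state, x_t, y_pred, y_true)
    return preds, state
```

Because the state is plain text, the sequence of rewritten states forms the human-readable learning trace the abstract mentions.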