We study how to let large language models (LLMs) process arbitrarily long prompts, viewed through the lens of inference-time scaling. We propose Recursive Language Models (RLMs), a general inference strategy that treats a long prompt as part of an external environment and allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of that prompt. We find that RLMs successfully handle inputs up to two orders of magnitude beyond the model's context window and, even for shorter prompts, dramatically outperform base LLMs and common long-context scaffolds in answer quality across four diverse long-context tasks, at a comparable (or lower) cost per query.
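To make the core idea concrete, here is a minimal sketch of recursive decomposition over a long prompt. It is illustrative only: `llm_call` is a hypothetical wrapper around any chat-completion API, and the fixed character-based chunking is an assumption; in the RLM setup described above, the LLM itself decides programmatically how to examine and split the prompt rather than following a hard-coded strategy.

```python
def llm_call(prompt: str) -> str:
    """Hypothetical single call to a base LLM; replace with a real API client."""
    raise NotImplementedError


def recursive_lm(query: str, context: str, max_chars: int = 8_000) -> str:
    """Answer `query` over `context`, recursing whenever the context exceeds the budget."""
    if len(context) <= max_chars:
        # Base case: the snippet fits in the budget, so answer directly.
        return llm_call(f"Context:\n{context}\n\nQuestion: {query}")

    # Recursive case: decompose the long prompt into snippets, answer each one
    # with a recursive call, then merge the partial answers in a final call.
    snippets = [context[i:i + max_chars] for i in range(0, len(context), max_chars)]
    partial_answers = [recursive_lm(query, snippet, max_chars) for snippet in snippets]
    merged = "\n".join(f"- {answer}" for answer in partial_answers)
    return llm_call(
        "Partial answers drawn from different parts of a long document:\n"
        f"{merged}\n\nQuestion: {query}\nCombine these into a single answer."
    )
```

Because each call sees only a bounded snippet (or a bounded list of partial answers), the total input can exceed the base model's context window by a large factor while every individual call stays within it.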