With extensive pre-trained knowledge and high-level general capabilities, large language models (LLMs) have emerged as a promising avenue for augmenting reinforcement learning (RL) in aspects such as multi-task learning, sample efficiency, and high-level task planning. In this survey, we provide a comprehensive review of the existing literature on LLM-enhanced RL and summarize its characteristics compared with conventional RL methods, aiming to clarify the research scope and directions for future studies. Based on the classical agent-environment interaction paradigm, we propose a structured taxonomy that systematically categorizes LLMs' functionalities in RL into four roles: information processor, reward designer, decision-maker, and generator. For each role, we summarize the methodologies, analyze the specific RL challenges that are mitigated, and provide insights into future directions. Lastly, we present a comparative analysis of the four roles and discuss potential applications, prospective opportunities, and open challenges of LLM-enhanced RL. By proposing this taxonomy, we aim to provide a framework that helps researchers effectively leverage LLMs in the RL field, potentially accelerating RL applications in complex domains such as robotics, autonomous driving, and energy systems.