This manuscript gives a high-level, up-to-date overview of (deep) reinforcement learning and sequential decision making. It covers value-based RL, policy-gradient methods, model-based methods, and several other topics, including a very brief discussion of combining RL with LLMs.