Facing Off World Model Backbones: RNNs, Transformers, and S4

World models are a fundamental component in model-based reinforcement learning (MBRL). To perform temporally extended and consistent simulations of the future in partially observable environments, world models need to possess long-term memory. However, state-of-the-art MBRL agents, such as Dreamer, predominantly employ recurrent neural networks (RNNs) as their world model backbone, which have limited memory capacity. In this paper, we seek to explore alternative world model backbones for improving long-term memory. In particular, we investigate the effectiveness of Transformers and Structured State Space Sequence (S4) models, motivated by their remarkable ability to capture long-range dependencies in low-dimensional sequences and their complementary strengths. We propose S4WM, the first world model compatible with parallelizable SSMs including S4 and its variants. By incorporating latent variable modeling, S4WM can efficiently generate high-dimensional image sequences through latent imagination. Furthermore, we extensively compare RNN-, Transformer-, and S4-based world models across four sets of environments, which we have tailored to assess crucial memory capabilities of world models, including long-term imagination, context-dependent recall, reward prediction, and memory-based reasoning. Our findings demonstrate that S4WM outperforms Transformer-based world models in terms of long-term memory, while exhibiting greater efficiency during training and imagination. These results pave the way for the development of stronger MBRL agents.
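As an illustration of what "parallelizable SSMs" refers to, the sketch below runs a plain discrete linear state space model in two equivalent ways: step by step like an RNN, and as a single causal convolution over the whole sequence. This is not the S4 parameterization, and it is not S4WM. The function names, the dense matrix `A`, and the single input and output channel are assumptions made for the example, and S4 itself computes the kernel far more efficiently than the loop shown here.

```python
import numpy as np


def ssm_recurrent(A, B, C, u):
    """Sequential mode: x_k = A x_{k-1} + B u_k, y_k = C x_k (x_0 = 0)."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)


def ssm_convolutional(A, B, C, u):
    """Parallel mode: y = K * u with kernel K_j = C A^j B."""
    length = len(u)
    kernel = np.empty(length)
    a_pow_b = B.copy()  # holds A^j B, starting at j = 0
    for j in range(length):
        kernel[j] = C @ a_pow_b
        a_pow_b = A @ a_pow_b
    # Keep only the first `length` outputs of the full convolution,
    # which are exactly the causal ones.
    return np.convolve(u, kernel)[:length]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = 0.3 * rng.normal(size=(4, 4))  # hypothetical state matrix, scaled down
    B = rng.normal(size=4)
    C = rng.normal(size=4)
    u = rng.normal(size=16)
    print(np.allclose(ssm_recurrent(A, B, C, u), ssm_convolutional(A, B, C, u)))
```

Both functions return the same outputs. The convolutional form processes a whole training sequence at once, while the recurrent form only needs to carry a fixed-size state from one step to the next.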

