Mimicking real interaction trajectories during world-model inference has been shown to improve the sample efficiency of model-based reinforcement learning (MBRL) algorithms. Many methods reason directly over known state sequences; however, this approach fails to improve reasoning quality by capturing the subtle variation between states. Much as humans infer trends in event development from such variation, in this work we introduce the Global-Local variation Awareness Mamba-based world model (GLAM), which improves reasoning quality by perceiving and predicting variation between states. GLAM comprises two Mamba-based parallel reasoning modules, GMamba and LMamba, which perceive variation from global and local perspectives, respectively, during the reasoning process. GMamba identifies patterns of variation between states in the input sequence and leverages these patterns to enhance the prediction of future state variation. LMamba emphasizes reasoning about unknown information, such as rewards, termination signals, and visual representations, by perceiving variation between adjacent states. By integrating the strengths of the two modules, GLAM accounts for higher-value variation in environmental changes, providing the agent with more efficient imagination-based training. We demonstrate that our method outperforms existing methods in normalized human score on the Atari 100k benchmark.
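The core idea above, reasoning over the variation between states rather than the states themselves, can be illustrated with a minimal numerical sketch. This is not the authors' implementation: the delta computation, the mean-based "global pattern", and the additive prediction step are simplifying assumptions standing in for the learned GMamba/LMamba sequence models.

```python
import numpy as np

def local_variation(states):
    """LMamba-style input (assumed): variation between adjacent
    states, s_t - s_{t-1}, over the trajectory."""
    return np.diff(states, axis=0)            # shape (T-1, D)

def global_variation_pattern(deltas):
    """GMamba-style summary (assumed): a pattern extracted over all
    per-step variations; a simple mean stands in for learned
    Mamba-based sequence modeling."""
    return deltas.mean(axis=0)                # shape (D,)

# Toy state trajectory: T=4 steps, D=2 features
states = np.array([[0.0, 0.0],
                   [1.0, 0.5],
                   [2.0, 1.0],
                   [3.0, 1.5]])

deltas = local_variation(states)              # each row is [1.0, 0.5]
pattern = global_variation_pattern(deltas)    # [1.0, 0.5]

# Predict the next state by applying the global variation pattern
# to the most recent state (illustrative fusion of the two views).
next_state = states[-1] + pattern             # [4.0, 2.0]
print(next_state)
```

The point of the sketch is the representational choice: feeding the model state differences exposes the trend information directly, instead of forcing the sequence model to recover it from raw states.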