World models constitute a promising approach for training reinforcement learning agents in a safe and sample-efficient manner. Recent world models predominantly operate on sequences of discrete latent variables to model environment dynamics. However, this compression into a compact discrete representation may ignore visual details that are important for reinforcement learning. Concurrently, diffusion models have become the dominant approach for image generation, challenging well-established methods that model discrete latents. Motivated by this paradigm shift, we introduce DIAMOND (DIffusion As a Model Of eNvironment Dreams), a reinforcement learning agent trained in a diffusion world model. We analyze the key design choices that are required to make diffusion suitable for world modeling, and demonstrate how improved visual detail can lead to improved agent performance. DIAMOND achieves a mean human-normalized score of 1.46 on the competitive Atari 100k benchmark, a new best for agents trained entirely within a world model. We further demonstrate that DIAMOND's diffusion world model can stand alone as an interactive neural game engine by training on static Counter-Strike: Global Offensive gameplay. To foster future research on diffusion for world modeling, we release our code, agents, videos and playable world models at https://diamond-wm.github.io.