In autonomous driving with an image-based state space, accurately predicting future events and modeling diverse behavioral modes are essential for safety and effective decision-making. World model-based Reinforcement Learning (WMRL) approaches offer a promising solution by simulating future states from the current state and actions. However, the utility of world models is often limited because typical RL policies are restricted to deterministic or single-Gaussian distributions; failing to capture the full spectrum of possible actions reduces their adaptability in complex, dynamic environments. In this work, we introduce Imagine-2-Drive, a framework consisting of two components: VISTAPlan, a high-fidelity world model for accurate future prediction, and the Diffusion Policy Actor (DPA), a diffusion-based policy that models multi-modal behaviors for trajectory prediction. We use VISTAPlan to simulate and evaluate trajectories from DPA, and apply Denoising Diffusion Policy Optimization (DDPO) to train DPA to maximize the cumulative sum of rewards over those trajectories. We analyze the benefits of each component and of the framework as a whole in CARLA using standard driving metrics. As a consequence of our twin novelties, VISTAPlan and DPA, we significantly outperform state-of-the-art (SOTA) world models on standard driving metrics, by 15% on Route Completion and 20% on Success Rate.