Achieving level-5 driving automation in autonomous vehicles necessitates a robust semantic visual perception system capable of parsing data from different sensors across diverse conditions. However, existing semantic perception datasets often lack important non-camera modalities typically used in autonomous vehicles, or they do not exploit such modalities to aid and improve semantic annotations in challenging conditions. To address this, we introduce MUSES, the MUlti-SEnsor Semantic perception dataset for driving in adverse conditions under increased uncertainty. MUSES includes synchronized multimodal recordings with 2D panoptic annotations for 2500 images captured under diverse weather and illumination conditions. The dataset integrates a frame camera, a lidar, a radar, an event camera, and an IMU/GNSS sensor. Our new two-stage panoptic annotation protocol captures both class-level and instance-level uncertainty in the ground truth and enables the novel task of uncertainty-aware panoptic segmentation that we introduce, along with standard semantic and panoptic segmentation. MUSES proves both effective for training and challenging for evaluating models under diverse visual conditions, and it opens new avenues for research in multimodal and uncertainty-aware dense semantic perception. Our dataset and benchmark will be made publicly available.