Perception is a cornerstone of autonomous driving, enabling vehicles to understand their surroundings and make safe, reliable decisions. Developing robust perception algorithms requires large-scale, high-quality datasets that cover diverse driving conditions and support thorough evaluation. Existing datasets often lack a high-fidelity digital twin, limiting systematic testing, edge-case simulation, sensor modification, and sim-to-real evaluation. To address this gap, we present DrivIng, a large-scale multimodal dataset with a complete geo-referenced digital twin of an ~18 km route spanning urban, suburban, and highway segments. Our dataset provides continuous recordings from six RGB cameras, one LiDAR, and high-precision ADMA-based localization, captured across day, dusk, and night. All sequences are annotated at 10 Hz with 3D bounding boxes and track IDs across 12 classes, yielding ~1.2 million annotated instances. Beyond the benefits of the digital twin itself, DrivIng enables a 1-to-1 transfer of real traffic into simulation, preserving agent interactions while enabling realistic and flexible scenario testing. To support reproducible research and robust validation, we benchmark DrivIng with state-of-the-art perception models and publicly release the dataset, digital twin, HD map, and codebase.