Car-STAGE: Automated framework for large-scale high-dimensional simulated time-series data generation based on user-defined criteria

Generating large-scale sensing datasets through photo-realistic simulation is an important part of many robotics applications, such as autonomous driving. In this paper, we consider the problem of synchronous data collection from the open-source CARLA simulator, using multiple sensors attached to a vehicle and driven by user-defined criteria. We propose a novel one-step framework, Car-STAGE, built on the CARLA simulator: the user defines the configuration parameters through a graphical user interface (GUI), and data collection then proceeds without further user intervention. The framework uses these parameters (choice of map, number and configuration of sensors, environmental and lighting conditions, and so on) to run the simulation in the background. It collects high-dimensional data from diverse sensors, namely RGB camera, LiDAR, radar, depth camera, IMU, GNSS, semantic segmentation camera, instance segmentation camera, and optical flow camera, together with the ground truth of the individual actors, and stores both the sensor data and the ground-truth labels in a local or cloud-based database. The framework is multi-threaded: a main thread runs the server, a worker thread manages the queue and the frame number, and the remaining threads process the sensor data. A second source of speedup over the native implementation is that raw binary data is written to disk through memory mapping and converted into standard formats only at the end of data collection. We show that these techniques yield a significant speedup as the number of frames, the number of sensors, and the number of spawned objects increase.
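
The two techniques named above, handing frames to a separate thread through a queue and deferring format conversion by writing raw bytes into a memory-mapped file, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Car-STAGE implementation: it does not call the CARLA API, the main thread merely stands in for the simulator, and every name in it (`FRAME_BYTES`, `writer`, `collect`, `convert`) as well as the fixed 16-byte frame layout is hypothetical.

```python
import mmap
import queue
import struct
import threading

FRAME_BYTES = 16  # hypothetical fixed size of one raw frame (four float32 values)


def writer(frames: queue.Queue, buf: mmap.mmap) -> None:
    """Sensor thread: copy each raw frame into its slot in the mapped file."""
    while True:
        item = frames.get()
        if item is None:  # sentinel: collection is finished
            break
        frame_no, payload = item
        start = frame_no * FRAME_BYTES
        buf[start:start + FRAME_BYTES] = payload  # raw copy, no decoding here


def collect(path: str, n_frames: int) -> None:
    """Produce n_frames (> 0) dummy frames and store them raw via a memory map."""
    # Pre-size the file so that every frame has a fixed slot.
    with open(path, "wb") as f:
        f.truncate(n_frames * FRAME_BYTES)

    frames: queue.Queue = queue.Queue()
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as buf:
        worker = threading.Thread(target=writer, args=(frames, buf))
        worker.start()
        for frame_no in range(n_frames):
            # Stand-in for a sensor callback: four float32 values per frame.
            payload = struct.pack("<4f", *([float(frame_no)] * 4))
            frames.put((frame_no, payload))
        frames.put(None)
        worker.join()
        buf.flush()  # make sure the mapped pages reach the file


def convert(path: str, n_frames: int) -> list:
    """After collection: decode the raw bytes into tuples of floats."""
    with open(path, "rb") as f:
        raw = f.read()
    return [struct.unpack_from("<4f", raw, i * FRAME_BYTES) for i in range(n_frames)]
```

Calling `collect(path, n)` followed by `convert(path, n)` separates the time-critical capture loop from the decoding step, which is the design choice the abstract describes; a real pipeline would presumably use one such writer per sensor and a format-specific converter.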
